(57) Abstract

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones and which also binds to the p300/CBP cellular protein. The present
invention further provides a nucleic acid encoding the P/CAF protein as well as a
vector containing the nucleic acid and a host for the vector. A purified antibody
which specifically binds the P/CAF protein is also provided. Also provided are
methods of screening for compounds that inhibit or stimulate the transcription
modulating and histone acetyltransferase activity of P/CAF and p300/CBP.
P300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF AND USES THEREOF



BACKGROUND OF THE INVENTION 


Field of the Invention 

The present invention provides a transcriptional co-factor, p300/CBP-associated 
factor (P/CAF), which modulates transcription through binding to the cellular 
transcription co-factors p300 and CBP and through acetylation of histones. Also 
provided are methods for screening for the presence of P/CAF and for substances which
alter the transcription modulating effect and growth regulatory activity of P/CAF. 



Background Art 

Cellular proteins p300 and CBP are global transcriptional coactivators that are
involved in the regulation of various DNA-binding transcriptional factors (Janknecht and
Hunter, 1996). Recently, p300 was found to be very closely related to CBP, a factor
that binds selectively to the protein kinase A-phosphorylated form of CREB (3-5).
Cellular factors p300 and CBP exhibit strong amino acid sequence similarity and share
the capacity to bind both CREB and E1A (6-8). Although neither p300 nor CBP by
itself binds to DNA, each can be recruited to promoter elements via interaction with
sequence-specific activators and functions as a transcriptional adaptor. For
simplicity, p300 and CBP will be termed p300/CBP in the context of discussing their
shared functional properties.



p300/CBP is a large protein consisting of over 2,400 amino acids, known to
interact with a variety of DNA-binding transcriptional factors including nuclear hormone
receptors (13,57), CREB (3,4,7), c-Jun/v-Jun (9,11), YY1 (10), c-Myb/v-Myb (12,58),
Sap-1a (59), c-Fos (11) and MyoD (60). DNA-binding factors recruit p300/CBP not
only directly but also indirectly, through cofactors; for example, nuclear
hormone receptors recruit p300/CBP both directly and indirectly via
SRC-1, which stimulates transcription by binding to various nuclear hormone receptors
(13,61).
The transforming proteins encoded by adenovirus and several other small DNA
tumor viruses disturb host cell growth control by interacting with cellular factors that
normally function to repress cell proliferation. One of the most intensively studied of
these viral proteins, the product of the adenovirus E1A gene, is itself sufficient for
transformation (1). E1A transforming activity resides in two distinct domains, the
targets of which include p300/CBP and products of the retinoblastoma (RB)
susceptibility gene family (1,2). Interactions of E1A with p300/CBP and RB are
thought to influence functionally distinct growth regulatory pathways, allowing the two
domains to contribute additively to transformation (1).


The paradigm for how E1A and functionally related viral proteins perturb cell
growth regulation derives in large part from studies on their interactions with RB (1,2).
The molecular function of E1A is based on its capacity to interfere with cellular
protein-protein interactions. Since both E1A and various cellular targets bind to a site in RB
termed the pocket domain (2), E1A can competitively disrupt the complex formation
between RB and its cellular targets.

The second cellular factor implicated in E1A-dependent transformation, p300, is
believed to inhibit G0/G1 exit, to activate certain enhancers, and to stimulate
differentiation (1,2). E1A inhibits the p300/CBP-mediated transcriptional activation of
many promoters (14). In one case that has been examined, the complex of p300 and
YY1, E1A inhibits transcription without disrupting the complex (10).

The present invention provides a cellular protein designated P/CAF which binds
to p300/CBP and plays an important role in both transcription and cell cycle regulation
associated with a histone acetyltransferase activity. The present invention also provides
a histone acetyltransferase activity in the p300/CBP cellular protein, thus providing
targets for modulating transcription and cell cycle regulation in cells.






WO 98/03652 



PCT/US97/12877 




SUMMARY OF THE INVENTION 



The present invention provides a purified protein designated P/CAF having a 
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate 
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones and which also binds to the p300/CBP cellular protein. 

The present invention further provides a nucleic acid encoding the P/CAF 
protein as well as a vector containing the nucleic acid and a host for the vector. A
purified antibody which specifically binds the P/CAF protein is also provided.



Also provided is a bioassay for screening substances for the ability to
inhibit the transcription modulating activity of P/CAF and/or histone acetyltransferase
activity, comprising contacting the substance with a system in which histone acetylation
by P/CAF can be determined; determining the amount of histone acetylation by P/CAF
in the presence of the substance; and comparing the amount of histone acetylation by
P/CAF in the presence of the substance with the amount of histone acetylation by
P/CAF in the absence of the substance, a decreased amount of histone acetylation by
P/CAF in the presence of the substance indicating a substance that can inhibit the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF.



Furthermore, the present invention provides a bioassay for screening substances
for the ability to inhibit the transcription modulating activity and/or histone
acetyltransferase activity of P/CAF comprising contacting the substance with a system in
which the p300 binding of P/CAF can be determined; determining the amount of p300
binding of P/CAF in the presence of the substance; and comparing the amount of p300
binding of P/CAF in the presence of the substance with the amount of p300 binding of
P/CAF in the absence of the substance, a decreased amount of p300 binding of P/CAF in
the presence of the substance indicating a substance that can inhibit the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF.
Also provided is a method for determining the amount of P/CAF in a biological
sample comprising contacting the biological sample with a polypeptide comprising the
amino acid sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300
complex can be formed; and determining the amount of the P/CAF/p300 complex, the
amount of the complex indicating the amount of P/CAF in the sample.

The present invention additionally provides a method for determining the amount
of P/CAF in a biological sample comprising contacting the biological sample with an
antibody which specifically binds P/CAF under conditions whereby a P/CAF/antibody
complex can be formed; and determining the amount of the P/CAF/antibody complex,
the amount of the complex indicating the amount of P/CAF in the sample.

Also provided herein is an assay for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of P/CAF, comprising: contacting the
substance with a system in which histone acetylation by P/CAF can be determined;
determining the amount of histone acetylation by P/CAF in the presence of the
substance; and comparing the amount of histone acetylation by P/CAF in the presence of
the substance with the amount of histone acetylation by P/CAF in the absence of the
substance, a decreased or increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can inhibit or stimulate,
respectively, the histone acetyltransferase activity of P/CAF.
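
The comparison step of such an assay can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from the specification: the function name, the use of a ratio, and the 10% tolerance band are assumptions introduced here, and the readings stand for any consistent measure of histone acetylation (for example, incorporated radiolabel counts).

```python
def classify_substance(with_substance, without_substance, tolerance=0.10):
    """Compare acetylation measured with and without a test substance.

    `tolerance` is an assumed band around the control value inside which
    the substance is treated as having no detectable effect.
    """
    if without_substance <= 0:
        raise ValueError("control acetylation must be positive")
    ratio = with_substance / without_substance
    if ratio < 1.0 - tolerance:
        return "inhibitor"    # decreased acetylation in the presence of the substance
    if ratio > 1.0 + tolerance:
        return "stimulator"   # increased acetylation in the presence of the substance
    return "no effect"


print(classify_substance(40.0, 100.0))   # inhibitor
print(classify_substance(150.0, 100.0))  # stimulator
```

In practice the tolerance would be set from replicate variability rather than fixed in advance.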

The present invention further provides an assay for screening substances for the
ability to inhibit binding of P/CAF to p300/CBP comprising: contacting the substance
with a system in which the P/CAF binding of p300/CBP can be determined; determining
the amount of P/CAF binding of p300/CBP in the presence of the substance; and
comparing the amount of binding of P/CAF to p300/CBP in the presence of the
substance with the amount of binding of P/CAF to p300/CBP in the absence of the
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of P/CAF to
p300/CBP.
In addition, an assay is provided for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of p300/CBP, comprising: contacting
the substance with a system in which histone acetylation by p300/CBP can be
determined; determining the amount of histone acetylation by p300/CBP in the presence
of the substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased or increased amount of histone acetylation by
p300/CBP in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of p300/CBP.

Furthermore, the present invention provides an assay for screening substances
for the ability to inhibit binding of a DNA-binding transcription factor to p300/CBP
comprising: contacting the substance with a system in which the DNA-binding
transcription factor binding of p300/CBP can be determined; determining the amount of
DNA-binding transcription factor binding of p300/CBP in the presence of the substance;
and comparing the amount of binding of DNA-binding transcription factor to p300/CBP
in the presence of the substance with the amount of binding of DNA-binding
transcription factor to p300/CBP in the absence of the substance, a decreased amount of
binding of DNA-binding transcription factor to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of DNA-binding
transcription factor to p300/CBP.

A method is also provided for inhibiting the transcription modulating activity of 
P/CAF in a subject, comprising administering to the subject a transcription modulating
activity inhibiting amount of a substance in a pharmaceutically acceptable carrier.

Also provided in the present invention is a method for stimulating the 
transcription modulating activity of P/CAF in a subject, comprising administering to the 
subject a transcription modulating activity stimulating amount of a substance in a
pharmaceutically acceptable carrier. 
Furthermore, the present invention provides a method for inhibiting the histone
acetyltransferase activity of p300/CBP in a subject, comprising administering to the
subject a histone acetyltransferase activity inhibiting amount of a substance in a
pharmaceutically acceptable carrier.

Finally, the present invention additionally provides a method for stimulating the 
histone acetyltransferase activity of p300/CBP in a subject, comprising administering to 
the subject a histone acetyltransferase activity stimulating amount of a substance in a 
pharmaceutically acceptable carrier.



BRIEF DESCRIPTION OF THE FIGURES 



Figs. 1A-B. Fig. 1A: P/CAF-p300/CBP interaction in vivo. Cell extract was
immunoprecipitated with rabbit anti-P/CAF (lanes 1, 4, and 7), rabbit anti-CBP (lanes 2
and 5), and mouse anti-p300 (lane 9) antibodies. For controls, cell extract was
precipitated with rabbit control IgG (lanes 3, 6, and 8) or mouse anti-HA monoclonal
antibody (lane 10). The precipitates were analyzed by immunoblotting with anti-P/CAF
(lanes 1-3), anti-CBP (lanes 4-6), and anti-p300 (lanes 7-10) antibodies. The positions
of non-specific bands are indicated by asterisks. Fig. 1B: E1A inhibits the P/CAF-p300
interaction in vivo. Osteosarcoma cells were transfected with either control vector
(lanes 1 and 4) or E1A- (lanes 2 and 5) or E1AΔN- (lanes 3 and 6) expression vectors.
Extract from the transfected subpopulation was immunoprecipitated with anti-P/CAF
(lanes 1-3) or control (lanes 4-6) IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF.

Figs. 2A-F. P/CAF and E1A mediate antagonistic effects on cell cycle
progression. HeLa cells (ATCC accession number CCL 2) were transfected by
electroporation with 7 μg of P/CAF-expression plasmid and/or 3 μg of the full-length or
the N-terminally deleted (Δ2-36) E1A 12S-expression plasmid as indicated in the figure.
These plasmids were constructed by subcloning FLAG-P/CAF and E1A cDNAs into
pCX (34) and pcDNAI (Invitrogen), respectively. All samples, in addition, contained 1
μg of sorting plasmid (pCMV-IL2R) (31) and carrier plasmid (pCX) to normalize the
total amount of DNA to 11 μg. After transfection, cells were incubated in Dulbecco's
modified Eagle's medium with 10% fetal bovine calf serum for 12 hours and
subsequently labeled in medium containing 10 μM bromo-deoxyuridine (BrdU) for 30
min. Subsequently, the transfected subpopulation was purified by magnetic affinity cell
sorting and nuclei were analyzed by dual parameter flow cytometry as described (32).
Histograms show percentages of cells in G1 and S phases. Abscissa values represent
fluorescence intensity of bound anti-BrdU antibodies in log scale.


Fig. 3. Histone acetyltransferase activity of P/CAF. Activity of hGCN5 (lanes 1
and 4) and P/CAF (lanes 2 and 5) that acetylates free histones (lanes 1-3) or histones in
the nucleosome core particle (35) (lanes 4-6) was measured as described (36). Each
reaction contains 0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol
of the histone octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA.
Note that the histone octamer dissociates into dimers or tetramers under assay
conditions. Acetylated histones were detected by autoradiography after separation by
SDS-PAGE. The bands corresponding to acetylated histones H3 and H4 are indicated
by arrows.


DETAILED DESCRIPTION OF THE INVENTION 

As used in the specification and in the claims, "a" can mean one or more, 
depending upon the context in which it is used. 


P/CAF protein and fragments 

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones. The P/CAF protein can also bind to the amino acid region of SEQ ID NO:3
(amino acid (aa) residues 1753 - 1966) of the cellular transcriptional factor, p300 (which
has the complete amino acid sequence of SEQ ID NO:6 and the nucleotide sequence of
SEQ ID NO:12), and the amino acid region of SEQ ID NO:6 (amino acid residues 1805
- 1854) of the cellular transcriptional factor, CBP (which has the complete amino acid
sequence of SEQ ID NO:7 and the nucleotide sequence of SEQ ID NO:13). The
P/CAF protein can be defined by any one or more of the typically used parameters.
Examples of these parameters include, but are not limited to, molecular weight
(calculated or empirically determined), isoelectric focusing point, specific epitope(s),
complete amino acid sequence, sequence of a specific region (e.g., N-terminus) of the
amino acid sequence and the like.

For example, the P/CAF protein can consist of the amino acid sequence of SEQ
ID NO:1 or the P/CAF protein can comprise the amino acid sequence of SEQ ID NO:2,
which represents the carboxy terminal end of the P/CAF protein and contains the histone
acetyltransferase activity, or the amino acid sequence of SEQ ID NO:4, which
represents the amino terminal end of the P/CAF protein, containing the binding site for
p300/CBP. Because the amino-terminal region is specific for P/CAF it can be used to
define and identify P/CAF.

As used herein, "purified" refers to a protein (polypeptide, peptide, etc.) that is
sufficiently free of contaminants or cell components with which it normally occurs to
distinguish it from the contaminants or other components of its natural environment.
The purified protein need not be homogeneous, but must be sufficiently free of
contaminants to be useful in a clinical or research setting, for example, in an assay for
detecting antibodies to the protein. Greater levels of purity can be obtained using
methods derived from well known protocols. Specific methods for purifying P/CAF
proteins are known in the art.

As will be appreciated by those skilled in the art, the invention also includes
those P/CAF polypeptides having slight variations in amino acid sequence which yield
polypeptides equivalent to the P/CAF protein defined herein. Such variations may arise
naturally as allelic variations (e.g., due to genetic polymorphism) or may be produced by
human intervention (e.g., by mutagenesis of cloned DNA sequences), such as induced
point, deletion, insertion and substitution mutants. Minor changes in amino acid
sequence are generally preferred, such as conservative amino acid replacements, small
internal deletions or insertions, and additions or deletions at the ends of the molecules.
Substitutions may be designed based on, for example, the model of Dayhoff et al. (37).
These modifications can result in changes in the amino acid sequence, provide silent
mutations, modify a restriction site, or provide other specific mutations.

Modifications to any of the P/CAF proteins or fragments can be made, while 
preserving the specificity and activity (function) of the native protein or fragment 
thereof. As used herein, "native" describes a protein that occurs in nature. The 
modifications contemplated herein can be conservative amino acid substitutions, for 
example, the substitution of a basic amino acid for a different basic amino acid. 
Modifications can also include creation of fusion proteins with epitope tags or known 
recombinant proteins or genes encoding them created by subcloning into commercial or 
non-commercial vectors (e.g., polyhistidine tags, flag tags, myc tag,
glutathione-S-transferase [GST] fusion protein, xylE fusion reporter construct). Furthermore, the
modifications can be such as do not affect the function of the protein or the way the
protein accomplishes that function (e.g., its secondary structure or the ultimate result of
the protein's activity). These products are equivalent to the P/CAF protein. The means
for determining the function, way and result parameters are well known. 

Having provided an example of a purified P/CAF protein, the invention also
enables the purification of P/CAF homologs from other species and allelic variants from
individuals within a species. For example, an antibody raised against the exemplary
human P/CAF protein can be used routinely to screen preparations from different
humans for allelic variants of the P/CAF protein that react with the P/CAF
protein-specific antibody. Similarly, an antibody raised against an epitope, for example, from a
conserved amino acid region of the human P/CAF protein can be used to routinely
screen for homologs of the P/CAF protein in other species. A P/CAF protein can be
routinely identified in and obtained from other species and from individuals within a
species using the methods taught herein and others known in the art. For example,
given the present sequence, the DNA encoding a conserved amino acid sequence can be
used to probe genomic DNA or DNA libraries of an organism to predictably obtain the
P/CAF gene for that organism. The gene can then be cloned and expressed as the
P/CAF protein and purified according to any of a number of routine, predictable
methods. An example of the routine protein purification methods available in the art can
be found in Pei et al. (38).

A purified polypeptide fragment of the P/CAF protein is also provided. The
term "fragment" as used herein regarding a P/CAF protein, means a molecule of at least
five contiguous amino acids of P/CAF protein that has at least one function shared by
P/CAF protein or a region thereof. These functions can include antigenicity, binding
capacity, acetyltransferase activity and structural roles, among others. The P/CAF
fragment can be specific for a recited source. As used herein to describe an amino acid
sequence (protein, polypeptide, peptide, etc.), "specific" means that the amino acid
sequence is not found identically in any other source. The determination of specificity is
made routine by the availability of computerized amino acid sequence databases and
sequence comparison programs, wherein an amino acid sequence of almost any length
can be quickly and reliably checked for the existence of identical sequences. If an
identical sequence is not found, the protein is "specific" for the recited source. For
example, a P/CAF fragment can be species-specific (e.g., found in the P/CAF protein of
humans, but not of other species).
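
The identity check described above can be illustrated with a short Python sketch. It is a toy stand-in for a database search, offered under stated assumptions: the sequences are invented placeholders (not P/CAF or any sequence from the sequence listing), the function names are hypothetical, and a real search would use a sequence comparison program against a curated database rather than in-memory strings.

```python
def fragments(sequence, length=5):
    """Return every contiguous fragment of the given length."""
    return {sequence[i:i + length] for i in range(len(sequence) - length + 1)}


def specific_fragments(source, others, length=5):
    """Return fragments of `source` not found identically in any other sequence."""
    return sorted(
        frag for frag in fragments(source, length)
        if not any(frag in other for other in others)
    )


# Invented placeholder sequences, for illustration only.
source = "MKTAYIAK"
others = ["GGMKTAYGG"]
print(specific_fragments(source, others))  # ['AYIAK', 'KTAYI', 'TAYIA']
```

The fragment "MKTAY" is dropped because it occurs identically in the comparison sequence; the remaining five-residue fragments would count as "specific" in the sense defined above.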

A fragment of the P/CAF protein having histone acetyltransferase activity can
consist of the amino acid sequence of SEQ ID NO:2. A fragment of the P/CAF protein
which binds to the amino acid sequence of SEQ ID NO:3 on p300 and the amino acid
sequence of SEQ ID NO:9 on CBP can consist of the amino acid sequence of SEQ ID
NO:4. To the extent that these fragments are specific for P/CAF, they can be used to
identify and define P/CAF.
An antigenic fragment of P/CAF protein is provided. An antigenic fragment has
an amino acid sequence of at least about five consecutive amino acids of a P/CAF
protein amino acid sequence and binds an antibody or elicits an immune response in an
animal. An antigenic fragment can be selected by applying the routine technique of
epitope mapping to P/CAF protein to determine the regions of the proteins that contain
epitopes reactive with antibodies or are capable of eliciting an immune response in an
animal. Once the epitope is selected, an antigenic polypeptide containing the epitope
can be synthesized directly, or produced recombinantly by cloning nucleic acids
encoding the antigenic polypeptide in an expression system, according to standard
methods.

Alternatively, an antigenic fragment of the antigen can be isolated from the 
whole P/CAF protein or a larger fragment of the P/CAF protein by chemical or 
mechanical disruption. Fragments can also be randomly chosen from a known P/CAF 
protein sequence and synthesized. The purified fragments thus obtained can be tested to
determine their antigenicity and specificity by routine methods. 

Nucleic Acids Encoding P/CAF Protein 

An isolated nucleic acid that encodes a P/CAF protein is also provided. As used
herein, the term "isolated" means a nucleic acid separated or substantially free from at
least some of the other components of the naturally occurring organism, for example,
the cell structural components commonly found associated with nucleic acids in a
cellular environment and/or other nucleic acids. The isolation of nucleic acids can
therefore be accomplished by techniques such as cell lysis followed by phenol plus
chloroform extraction, followed by ethanol precipitation of the nucleic acids (39). It is
not contemplated that the isolated nucleic acids are necessarily totally free of all
non-nucleic acid components or all other nucleic acids, but that the isolated nucleic acids are
isolated to a degree of purification to be useful in clinical, diagnostic, experimental, or
other procedures such as, for example, gel electrophoresis, Southern, Northern or dot
blot hybridization, or polymerase chain reaction (PCR).
A skilled artisan in the field will readily appreciate that there are a multitude of
procedures which may be used to isolate the nucleic acids prior to their use in other
procedures. These include, but are not limited to, lysis of the cell followed by gel
filtration or anion exchange chromatography, binding DNA to silica in the form of glass
beads, filters or diatoms in the presence of high concentrations of chaotropic salts, or
ethanol precipitation of the nucleic acids.

The nucleic acids of the present invention can include positive and negative
strand RNA as well as DNA and can include genomic and subgenomic nucleic acids
found in the naturally occurring organism. The nucleic acids contemplated by the
present invention include double stranded and single stranded DNA of the genome,
complementary positive stranded cRNA and mRNA, and complementary cDNA
produced therefrom and any nucleic acid which can selectively or specifically hybridize
to the isolated nucleic acids provided herein. Stringent conditions (further described
below) are used to distinguish selectively or specifically hybridizing nucleic acids from
non-selectively and non-specifically hybridizing nucleic acids.

An isolated nucleic acid that encodes a P/CAF protein can be species-specific
(i.e., does not encode the P/CAF protein of other species and does not occur in other
species). Examples of the nucleic acids contemplated herein include the nucleic acid of
SEQ ID NO:10 as well as the nucleic acids that encode each of the P/CAF proteins or
fragments thereof described herein. P/CAF proteins and protein fragments can be
routinely obtained as described herein and their structure (sequence) determined by
routine means including the methods as used herein.


P/CAF protein-encoding nucleic acids can be isolated from an organism in which
they are normally found (e.g., humans), using any of the routine techniques. For
example, a genomic DNA or cDNA library can be constructed and screened for the
presence of the nucleic acid of interest using one of the present P/CAF protein-encoding
nucleic acids as a probe. Methods of constructing and screening such libraries are well
known in the art and kits for performing the construction and screening steps are
commercially available (for example, Stratagene Cloning Systems, La Jolla, CA). Once
isolated, the nucleic acid can be directly cloned into an appropriate vector, or if
necessary, be modified to facilitate the subsequent cloning steps. Such modification
steps are routine, an example of which is the addition of oligonucleotide linkers, which
contain restriction sites, to the termini of the nucleic acid (see, for example, ref. 39).



P/CAF protein-encoding nucleic acids can also be synthesized. For example, a
method of obtaining a DNA molecule encoding a specific P/CAF protein is to synthesize
a recombinant DNA molecule which encodes the P/CAF protein. Nucleic
acid synthesis procedures are routine in the art and oligonucleotides coding for a
particular protein region are readily obtainable through automated DNA synthesis. A
nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector.
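
As a rough illustration of the oligonucleotide design just described, the following Python sketch derives the complementary strand of a core sequence and prepends a 5' overhang to each strand. It is a sketch under stated assumptions, not a protocol from the specification: the function names are hypothetical, and the core and overhang sequences are short placeholders chosen only to make the output easy to check by eye.

```python
# Translation table for Watson-Crick base pairing (DNA alphabet only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_duplex(core, top_overhang, bottom_overhang):
    """Return (top, bottom) oligonucleotides, each written 5'->3'.

    After annealing, the core regions pair with each other and each strand
    keeps its own single-stranded 5' overhang for ligation into a cut vector.
    """
    top = top_overhang.upper() + core.upper()
    bottom = bottom_overhang.upper() + reverse_complement(core)
    return top, bottom


# Placeholder sequences, for illustration only.
top, bottom = design_duplex("ATGGCC", "AATT", "GATC")
print(top)     # AATTATGGCC
print(bottom)  # GATCGGCCAT
```

The overhangs would in practice be chosen to match the cohesive ends of the restriction-digested vector.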



Oligonucleotides complementary to or identical with the P/CAF protein-encoding
nucleic acid sequence can be synthesized as primers for amplification
reactions, such as PCR, or as probes to detect P/CAF protein-encoding nucleic acids by
various hybridization protocols (e.g., Northern blot, Southern blot, dot blot, colony
screening, etc.).



Double-stranded molecules coding for relatively large proteins can readily be
synthesized by first constructing several different double-stranded molecules that code
for particular regions of the protein, followed by ligating these DNA molecules together.
For example, Cunningham et al. (40) have constructed a synthetic gene encoding the
human growth hormone by first constructing overlapping and complementary synthetic
oligonucleotides and ligating these fragments together. See also Ferretti et al. (41),
wherein synthesis of a 1057 base pair synthetic bovine rhodopsin gene from synthetic
oligonucleotides is disclosed.
By constructing a P/CAF protein-encoding nucleic acid in this manner, one
skilled in the art can readily obtain any particular P/CAF protein with modifications at
any particular position or positions. See also U.S. Patent No. 5,503,995, which
describes an enzyme template reaction method of making synthetic genes. Techniques
such as this are routine in the art and are well documented. DNA encoding the P/CAF
protein or P/CAF protein fragments can then be expressed in vivo or in vitro.

The nucleic acid encoding the P/CAF protein can be any nucleic acid that
functionally encodes the P/CAF protein. To functionally encode the protein (i.e., allow
the nucleic acid to be expressed), the nucleic acid can include, but is not limited to,
expression control sequences, such as an origin of replication, a promoter, regions
upstream or downstream of the promoter, such as enhancers that may regulate the
transcriptional activity of the promoter, appropriate restriction sites to facilitate cloning
of inserts adjacent to the promoter, antibiotic resistance genes or other markers which
can serve to select for cells containing the vector or the vector containing the insert, and
necessary information processing sites, such as ribosome binding sites, RNA splice sites,
polyadenylation sites and transcription termination sequences as well as any other
sequence which may facilitate the expression of the inserted nucleic acid.

Preferred expression control sequences are promoters derived from
metallothionein genes, actin genes, immunoglobulin genes, CMV, SV40, adenovirus,
bovine papilloma virus, etc. A nucleic acid encoding a P/CAF protein can readily be
determined based upon the genetic code for the amino acid sequence of the P/CAF
protein and many nucleic acid sequences will encode a P/CAF protein. Modifications in
the nucleic acid sequence encoding the P/CAF protein are also contemplated.
Modifications that can be useful are modifications to the sequences controlling
expression of the P/CAF protein to make production of P/CAF protein inducible or
repressible as controlled by the appropriate inducer or repressor. Such means are
standard in the art (see, e.g., ref 39). The nucleic acids can be generated by means
standard in the art, such as by recombinant nucleic acid techniques, as exemplified in the
examples herein, and by synthetic nucleic acid synthesis or in vitro enzymatic synthesis.
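
As a rough illustration of why many nucleic acid sequences can encode the same protein, the following Python sketch counts the distinct coding sequences for a short peptide under the standard genetic code. The peptide "MKR" and the function name count_coding_sequences are hypothetical and are not drawn from SEQ ID NO: 1; the sketch assumes only the 20 standard amino acids in one-letter code.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are not counted here).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Return how many distinct nucleotide sequences encode the peptide."""
    return prod(CODON_DEGENERACY[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Hypothetical tripeptide Met-Lys-Arg: 1 x 2 x 6 codon choices.
    print(count_coding_sequences("MKR"))
```

The count grows multiplicatively with peptide length, which is why a full-length protein can be encoded by a very large number of nucleic acid sequences.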

After a nucleic acid encoding a particular P/CAF protein of interest, or a region
of that nucleic acid, is constructed, modified, or isolated, that nucleic acid can then be
cloned into an appropriate vector, which can direct the in vivo or in vitro synthesis of
that wild-type and/or modified P/CAF protein. The vector is contemplated to have the
necessary functional elements that direct and regulate transcription of the inserted
nucleic acid, as described above. The vector containing the P/CAF nucleic acid or
nucleic acid fragment can be in a host (e.g., cell or transgenic animal) for expressing the
nucleic acid. The P/CAF protein or fragment thereof can thus be produced in a host
system containing the expression vector and its functional activity as described herein
can be demonstrated according to methods well known in the art.

There are numerous E. coli (Escherichia coli) expression vectors known to one
of ordinary skill in the art useful for the expression of proteins. Other microbial hosts
suitable for use include bacilli, such as Bacillus subtilis, and other enterobacteria, such
as Salmonella, Serratia, as well as various Pseudomonas species. These prokaryotic
hosts can support expression vectors which will typically contain expression control
sequences compatible with the host cell (e.g., an origin of replication). In addition, any
number of a variety of well-known promoters will be present, such as the lactose
promoter system, a tryptophan (Trp) promoter system, a beta-lactamase promoter
system, or a promoter system from phage lambda. The promoters will typically control
expression, optionally with an operator sequence, and have ribosome binding site
sequences, for example, for initiating and completing transcription and translation. If
necessary, an amino terminal methionine can be provided by insertion of a Met codon 5'
and in-frame with the gene sequence. Also, the carboxy-terminal extension of the
protein can be removed using standard oligonucleotide mutagenesis procedures.

Additionally, yeast expression can be used. There are several advantages to 
yeast expression systems. First, evidence exists that proteins produced in yeast secretion 
systems exhibit correct disulfide pairing. Second, post-translational glycosylation is 
efficiently carried out by yeast secretory systems. The Saccharomyces cerevisiae
pre-pro-alpha-factor leader region (encoded by the MFα-1 gene) is routinely used to direct
protein secretion from yeast (42). The leader region of pre-pro-alpha-factor contains a
signal peptide and a pro-segment which includes a recognition sequence for a yeast
protease encoded by the KEX2 gene. This enzyme cleaves the precursor protein on the
carboxyl side of a Lys-Arg dipeptide cleavage-signal sequence. The polypeptide coding
sequence can be fused in-frame to the pre-pro-alpha-factor leader region. This construct
is then put under the control of a strong transcription promoter, such as the alcohol
dehydrogenase I promoter or a glycolytic promoter. The protein coding sequence is
followed by a translation termination codon which is followed by transcription
termination signals. Alternatively, the polypeptide encoding sequence of interest can be
fused to a second protein coding sequence, such as Sj26 or β-galactosidase, used to
facilitate purification of the resultant fusion protein by affinity chromatography. The 
insertion of protease cleavage sites to separate the components of the fusion protein is 
applicable to constructs used for expression in yeast. 
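
The cleavage rule described above (cutting on the carboxyl side of a Lys-Arg dipeptide) can be sketched in Python as follows. This is a deliberately simplified model that treats every Lys-Arg pair as a cleavage site and ignores any sequence context the protease may require; the example sequence and the function names are hypothetical.

```python
from typing import List


def lys_arg_cleavage_sites(protein: str) -> List[int]:
    """Return 0-based indices of the residue that follows each Lys-Arg (KR) pair."""
    seq = protein.upper()
    return [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "KR"]


def cleave_after_lys_arg(protein: str) -> List[str]:
    """Split a one-letter protein sequence on the carboxyl side of each KR pair."""
    seq = protein.upper()
    cuts = [0] + lys_arg_cleavage_sites(seq) + [len(seq)]
    # Skip the empty fragment that would arise if the sequence ends in KR.
    return [seq[a:b] for a, b in zip(cuts, cuts[1:]) if a < b]


if __name__ == "__main__":
    # Hypothetical leader-fusion sequence with two Lys-Arg signals.
    print(cleave_after_lys_arg("MAKRGSKRTT"))
```

A check of this kind can help confirm that a coding sequence of interest does not itself contain an internal Lys-Arg pair before it is fused to the leader region.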

Efficient post-translational glycosylation and expression of recombinant proteins
can also be achieved in Baculovirus expression systems in insect cells.

Mammalian cells permit the expression of proteins in an environment that favors 
important post-translational modifications such as folding and cysteine pairing, addition
of complex carbohydrate structures and secretion of active protein. Vectors useful for
the expression of proteins in mammalian cells are characterized by insertion of the
protein encoding sequence between a strong viral promoter and a polyadenylation
signal. The vectors can contain genes conferring either gentamicin or methotrexate
resistance for use as selectable markers. For example, the antigen and immunoreactive
fragment coding sequence can be introduced into a Chinese hamster ovary (CHO) cell
line using a methotrexate resistance-encoding vector. Presence of the vector RNA in
transformed cells can be confirmed by Northern blot analysis and production of a cDNA
or opposite strand RNA corresponding to the protein encoding sequence can be
confirmed by Southern and Northern blot analysis, respectively. A number of other
suitable host cell lines capable of secreting intact proteins have been developed in the art
and include the CHO cell lines, HeLa cells, myeloma cell lines, Jurkat cells, and the like.
Expression vectors for these cells can include expression control sequences, as described
above. The vectors containing the nucleic acid sequences of interest can be transferred
into the host cell by well-known methods, which vary depending on the type of cell host.
For example, calcium chloride transfection is commonly utilized for prokaryotic cells,
whereas calcium phosphate treatment or electroporation may be used for other cell
hosts. 

Alternative vectors for the expression of protein in mammalian cells, similar to 
those developed for the expression of human gamma-interferon, tissue plasminogen 
activator, clotting Factor VIII, hepatitis B virus surface antigen, protease Nexin 1, and
eosinophil major basic protein, can be employed. Further, the vector can include CMV 
promoter sequences and a polyadenylation signal available for expression of inserted 
nucleic acid in mammalian cells (such as COS7). 

The nucleic acid sequences can be expressed in hosts after the sequences have
been positioned to ensure the functioning of an expression control sequence. These
expression vectors are typically replicable in the host organisms either as episomes or as
an integral part of the host chromosomal DNA. Commonly, expression vectors can
contain selection markers, e.g., tetracycline resistance or hygromycin resistance, to
permit detection and/or selection of those cells transformed with the desired nucleic acid
sequences (see, e.g., U.S. Patent 4,704,362). 

The nucleic acids produced as described above can also be expressed in a host 
which is a non-human animal to create a transgenic animal, containing, in a germ or
somatic cell, a nucleic acid comprising the coding sequence for all or a portion of the
P/CAF protein, as well as all of the other regulatory elements required for expression of
the P/CAF protein-encoding sequence. The animal will express the P/CAF gene or
portion thereof to produce the P/CAF protein or protein fragment and such expression
can be detected by determination of a particular phenotype unique to the transgenic
animal expressing the transferred nucleic acid.

The nucleic acid can be the nucleic acid of SEQ ID NO: 10, a nucleic acid having
a nucleotide sequence which encodes the P/CAF protein, a nucleic acid having a
nucleotide sequence which encodes the protein of SEQ ID NO: 1, as well as the nucleic
acids that encode the proteins comprising the fragments of SEQ ID NOS: 2 and 4.

The nucleic acids of the invention can contain substitutions or deletions which 
provide a particular phenotype of interest. For example, various deletions or base 
substitutions can be introduced into the nucleic acid encoding the P/CAF protein for the 
purpose of studying the effects of these particular deletions or substitutions on the
transcription modulation activity of the P/CAF protein. These effects can be monitored
by observation of such characteristics as growth and development of the animal, the
ability to develop tumors, survival rates and the like. The gene construct introduced
into the animal cells to produce the transgenic animal can contain any of the regulatory
elements described above to modulate expression of the foreign genes. As used herein,
the term "phenotype" includes morphology, biochemical profiles, changes in tumor
formation and other parameters that are affected by the presence of the P/CAF protein.

The transgenic animals of the invention can also be used in a method for 
determining the effectiveness of administering a nucleic acid encoding a functional
P/CAF protein to a subject in need of a functional P/CAF protein. First, a nucleic acid
encoding a nonfunctional P/CAF protein can be introduced into the animal's cells and
expressed to yield a characteristic phenotype. Then, using standard gene therapy
techniques, a nucleic acid encoding a functional P/CAF protein can be introduced into
the animal's cells and the effects on the animal's phenotypic characteristics can be
determined.

Having provided and taught how to obtain a nucleic acid that encodes a P/CAF 
protein, an isolated nucleic acid that encodes a fragment of P/CAF protein is also 
provided. The nucleic acid encoding the fragment can be obtained using any of the 
methods applicable to the nucleic acid encoding the entire P/CAF protein. The nucleic
acid fragment can encode a species-specific P/CAF protein fragment (e.g., found in the
P/CAF protein of humans, but not in the P/CAF proteins of other species). Nucleic
acids encoding species-specific fragments of P/CAF protein are themselves
species-specific or allele-specific fragments of the P/CAF gene.



Examples of fragments of a nucleic acid encoding a fragment of the P/CAF
protein can include the nucleic acid sequences which encode the amino acid sequences
of the fragments of SEQ ID NOS: 2 or 4. The same routine computer analyses used to
select these examples of fragments can be routinely used to obtain others. Fragments of
P/CAF-encoding nucleic acids can be primers for PCR or probes, which can be
species-specific, gene-specific or allele-specific. P/CAF-encoding nucleic acid fragments can
encode antigenic or immunogenic fragments of P/CAF protein that can be used in
therapeutic assays or screening protocols. P/CAF gene fragments can encode fragments
of P/CAF protein having histone acetylase activity and/or p300/CBP binding activity as
described above, as well as other uses that may become apparent.

An isolated nucleic acid of at least ten nucleotides that selectively hybridizes with 
the nucleic acid of SEQ ID NO: 10 under selected conditions is provided. For example, 
the conditions can be PCR amplification conditions and the hybridizing nucleic acid can 
be a primer consisting of a specific fragment of the reference sequence or a nearly 
identical nucleic acid that hybridizes only to the exemplified P/CAF-encoding nucleic
acid or allelic variants thereof.



The invention provides an isolated nucleic acid that selectively hybridizes with 
the P/CAF-encoding nucleic acid sequence of SEQ ID NO: 10 under stringent
conditions. The hybridizing nucleic acid can be a probe that hybridizes only to the
exemplified P/CAF-encoding nucleic acid sequence. Thus, the hybridizing nucleic acid
can be a naturally occurring species-specific allelic variant of the exemplified P/CAF
gene. The hybridizing nucleic acid can also include insubstantial base substitutions that
do not prevent hybridization under the stated stringent conditions or affect either the
function of the encoded protein, the way the protein accomplishes that function (e.g., its
secondary structure) or the ultimate result of the protein's activity. The means for
determining these parameters are well known. 



As used herein to describe nucleic acids, the term "selectively hybridizes" 
excludes the occasional randomly hybridizing nucleic acids as well as nucleic acids that
encode other known homologs of the P/CAF protein. The selectively hybridizing
nucleic acids of the invention can have at least 70%, 73%, 78%, 80%, 85%, 88%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% complementarity with the
segment and strand of the sequence to which it hybridizes. This list is not intended to
exclude percent complementarity values between these values. The nucleic acids can be
at least 10, 15, 16, 17, 18, 20, 21, 23, 24, 25, 30, 35, 40, 50, 100, 150, 200, 300, 500,
550, 750, 900, 950, or 1000 nucleotides in length or any intervening length, depending
on whether the nucleic acid is to be used as a primer, probe or for protein expression.
The hybridizing nucleic acid can comprise a region of at least ten nucleotides (up to full
length) that is completely complementary to a unique region of the nucleic acid to which
it hybridizes. 

The nucleic acid can be an alternative coding sequence for the P/CAF protein, or 
can be used as a probe or primer for detecting the presence of or obtaining the P/CAF 
protein. If used as primers, the invention provides compositions including at least two
nucleic acids which selectively hybridize with different regions of the nucleic acid so as 
to amplify a desired region. Depending on the length of the probe or primer, it can 
range between 70% complementary bases and full complementarity and still hybridize 
under stringent conditions. 
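
To make the notion of percent complementarity concrete, the Python sketch below scores a probe against an equal-length target segment by counting Watson-Crick base pairs in antiparallel alignment. The specification does not prescribe a particular calculation, so this ungapped comparison is only one assumed way to compute the figure; the sequences and the function name are hypothetical.

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(probe: str, target_segment: str) -> float:
    """Percent of probe bases that pair with the target segment.

    Both sequences are written 5' to 3' and must have equal length; the
    probe is aligned antiparallel to the target, without gaps.
    """
    probe = probe.upper()
    target = target_segment.upper()
    if not probe or len(probe) != len(target):
        raise ValueError("probe and target segment must be non-empty and equal in length")
    paired = sum(
        COMPLEMENT.get(p) == t for p, t in zip(probe, reversed(target))
    )
    return 100.0 * paired / len(probe)


if __name__ == "__main__":
    probe = "AAGGCCTTAG"              # hypothetical 10-mer
    perfect_target = "CTAAGGCCTT"     # reverse complement of the probe
    mismatched_target = "GTAAGCCCTA"  # three positions altered
    print(percent_complementarity(probe, perfect_target))
    print(percent_complementarity(probe, mismatched_target))
```

In this model a 10-nucleotide probe with three mismatches sits at the 70% lower bound recited above, which illustrates why shorter probes generally tolerate fewer mismatches under stringent conditions.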

For example, for the purpose of obtaining or determining the presence of a 
nucleic acid encoding the P/CAF protein, the degree of complementarity between the 
hybridizing nucleic acid (probe or primer) and the sequence to which it hybridizes 
(P/CAF DNA in a sample) should be at least enough to exclude hybridization with a 
nucleic acid from another species. The invention provides examples of these nucleic
acids of P/CAF, so that the degree of complementarity required to distinguish selectively
hybridizing from nonselectively hybridizing nucleic acids under stringent conditions can
be clearly determined for each nucleic acid. It should also be clear that the hybridizing 
nucleic acids of the invention will not hybridize with nucleic acids encoding unrelated 
proteins (hybridization is selective) under stringent conditions. 

"Stringent conditions" refers to the washing conditions used in a hybridization
protocol. In general, the washing conditions should be a combination of temperature
and salt concentration chosen so that the denaturation temperature is approximately
5-20°C below the calculated Tm of the nucleic acid hybrid under study. The temperature
and salt conditions are readily determined empirically in preliminary experiments in
which samples of reference DNA immobilized on filters are hybridized to the probe or
protein encoding nucleic acid of interest and then washed under conditions of different
stringencies. For example, the nucleic acid sequence of SEQ ID NO: 10 was used as a
specific radiolabeled probe for the detection of messenger RNA transcribed from the
P/CAF gene by performing hybridizations under stringent conditions. The Tm of such an
oligonucleotide can be estimated by allowing 2°C for each A or T nucleotide, and 4°C
for each G or C. For example, an 18 nucleotide probe of 50% G+C would, therefore,
have an approximate Tm of 54°C.
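
The Tm estimate just described (2°C per A or T, 4°C per G or C) and the 5-20°C offset can be sketched in Python as follows. The 18-nucleotide sequence is a hypothetical probe with nine A/T and nine G/C bases, not a fragment of SEQ ID NO: 10, and the function names are illustrative only.

```python
from typing import Tuple


def estimate_tm(oligo: str) -> int:
    """Estimate Tm in degrees C: 2 per A or T plus 4 per G or C."""
    seq = oligo.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("sequence may contain only A, T, G and C")
    return 2 * at_count + 4 * gc_count


def wash_temperature_range(oligo: str) -> Tuple[int, int]:
    """Return (low, high) temperatures lying 20 to 5 degrees C below the estimated Tm."""
    tm = estimate_tm(oligo)
    return tm - 20, tm - 5


if __name__ == "__main__":
    probe = "AGTCAGTCAGTCAGTCAG"  # hypothetical 18-mer, 50% G+C
    print(estimate_tm(probe))             # 9 x 2 + 9 x 4 = 54
    print(wash_temperature_range(probe))  # (34, 49)
```

This rule of thumb is intended for short oligonucleotides; for longer hybrids the temperature and salt conditions are determined empirically as stated above.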



The invention provides an isolated nucleic acid that selectively hybridizes with
the P/CAF gene shown in the sequence set forth as SEQ ID NO: 10 under stringent
conditions. The invention further provides an isolated nucleic acid complementary to 
the nucleotide sequence set forth in SEQ ID NO: 10. 



Antibodies to the P/CAF protein

A purified antibody and an antiserum containing polyclonal antibodies that 
specifically bind the P/CAF protein or antigenic fragment are also provided. The term 
"bind" means the well understood antigen/antibody binding as well as other nonrandom 
association with an antigen. "Specifically bind" as used herein describes an antibody or
other ligand that does not cross react substantially with any antigen other than the one
specified, in this case, an antigen of the P/CAF protein. Antibodies can be made as
described in Harlow and Lane (33). Briefly, purified P/CAF protein or an antigenic
fragment thereof can be injected into an animal in an amount and in intervals sufficient to
elicit a humoral immune response. Serum polyclonal antibodies can be purified directly,
or spleen cells from the animal can be fused with an immortal cell line and screened for
monoclonal antibody secretion, according to procedures well known in the art. Purified
monospecific polyclonal antibodies that specifically bind the P/CAF antigen are also
within the scope of the present invention. The antibodies of the present invention can
bind the protein of claim 1, the protein of claim 2, the protein of claim 3 and/or the 
protein of claim 4, as well as any other proteins of the present invention. 

A ligand that specifically binds the antigen is also contemplated. The ligand can
be a fragment of an antibody, such as, for example, an Fab fragment which retains
P/CAF binding activity, or a smaller molecule designed to bind an epitope of the P/CAF
antigen. The antibody or ligand can be bound to a substrate or labeled with a detectable
moiety or both bound and labeled. The detectable moieties contemplated within the
compositions of the present invention include those listed above in the description of the
diagnostic methods, including fluorescent, enzymatic and radioactive markers.

The antibody can be bound to a solid support substrate or conjugated with a 
detectable moiety or therapeutic compound or both bound and conjugated. Such
conjugation techniques are well known in the art. For example, conjugation of
fluorescent, radioactive or enzymatic moieties can be performed as described in the art
(33,43). The detectable moieties contemplated in the present invention can include
fluorescent, radioactive and enzymatic markers and the like. Therapeutic drugs
contemplated with the present invention can include cytotoxic moieties such as ricin A
chain, diphtheria toxin, Pseudomonas exotoxin and other chemotherapeutic compounds.

It is well understood by one of skill in the art that all of the above discussion 
regarding antibodies to P/CAF can also be applied with regard to production, 
characterization and use of antibodies which bind the p300/CBP protein or any of the
DNA-binding transcription factors of this invention. 

Measuring the P/CAF protein in a sample

The present invention also provides a method for determining the presence and 
thus the amount of P/CAF protein in a biological sample. As used herein, a biological 
sample includes any tissue or cell which would contain the P/CAF protein. Examples
include cells or tissues taken from surgical biopsies, isolated from a body fluid or prepared
in an in vitro tissue culture environment. 

One example of determining the amount of P/CAF in a biological sample can 
comprise contacting the biological sample with a polypeptide comprising the amino acid 
sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300 complex can be 
formed; and determining the amount of the P/CAF/p300 complex, the amount of the 
complex indicating the amount of P/CAF in the sample. Determination of the amount 
of P/CAF/p300 complex can be accomplished through techniques standard in the art. 
For example, the complex may be precipitated out of a solution and detected by the 
addition of a detectable moiety conjugated to the p300 protein or by the detection of an 
antibody which binds p300 or the P/CAF protein, as taught in the Examples herein. 
Antibodies which bind p300 or the P/CAF protein can be either monoclonal or 
polyclonal antibodies and can be obtained as described herein. Detection of 
P/CAF/p300 complexes by the detection of the binding of antibodies reactive with p300 
or the P/CAF protein can be accomplished using various immunoassays as are available 
in the art, as described below. 

Alternatively, determination of the amount of P/CAF in a biological sample can 
comprise contacting the biological sample with a polypeptide comprising the amino acid 
sequence of SEQ ID NO: 9 under conditions whereby a P/CAF/CBP complex can be
formed; and determining the amount of the P/CAF/CBP complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/CBP complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the CBP protein or by the detection of an
antibody which binds either CBP or the P/CAF protein, as taught in the Examples
herein. Antibodies which bind CBP or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of P/CAF/CBP
complexes by the detection of the binding of antibodies reactive with CBP or the P/CAF
protein can be accomplished using various immunoassays as are available in the art, as
described below.

Another example of determining the amount of P/CAF in a biological sample 
comprises contacting the biological sample with an antibody which specifically binds 
P/CAF under conditions whereby a P/CAF/antibody complex can be formed and
determining the amount of the P/CAF/antibody complex, the amount of the complex
indicating the amount of P/CAF in the sample. Antibodies which bind P/CAF can be
either monoclonal or polyclonal antibodies and can be obtained as described herein.
Determination of P/CAF/antibody complexes can be accomplished using various 
immunoassays as are available in the art, as described below. 

Immunoassays such as immunofluorescence assays, radioimmunoassays (RIA), 
immunoblotting and enzyme linked immunosorbent assays (ELISA) can be readily 
adapted for detection and measurement of P/CAF in a biological sample. Both 
polyclonal and monoclonal antibodies can be used in the assays. Available 
immunoassays are well known in the art and are extensively described in the patent
and scientific literature. See, for example, U.S. Patent Nos. 3,791,932; 3,839,153;
3,850,752; 3,850,578; 3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074;
3,984,533; 3,996,345; 4,034,074; and 4,098,876. 

Screening assays for P/CAF

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the histone acetyltransferase activity of P/CAF comprising contacting a 
system, in which histone acetylation by P/CAF can be determined, with the substance 
under conditions whereby histone acetylation by P/CAF can occur; determining the 
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the histone acetyltransferase activity of P/CAF. The
acetylation of histones by P/CAF can be determined in a system including, for example,
either core histones (histones H2A, H2B, H3 and H4) or the nucleosome core particles
(146 base pairs of DNA wrapped around the octamer of core histones) as substrates, the
P/CAF protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
inhibit the histone acetyltransferase activity of P/CAF can be added to this system and
assayed for inhibiting ability. 
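
The comparison step of such a screening assay can be sketched in Python as below. The acetylation values stand for any quantified signal (for example, band intensity from an autoradiograph), and the 20% cutoff, the example numbers and the function names are hypothetical choices made for illustration; the specification itself requires only a decreased or increased amount relative to the no-substance control.

```python
def percent_change_in_acetylation(with_substance: float, without_substance: float) -> float:
    """Percent change in histone acetylation relative to the no-substance control."""
    if without_substance <= 0:
        raise ValueError("control acetylation must be positive")
    return 100.0 * (with_substance - without_substance) / without_substance


def classify_substance(
    with_substance: float,
    without_substance: float,
    threshold_percent: float = 20.0,  # hypothetical cutoff, not from the specification
) -> str:
    """Label a substance by its effect on measured histone acetylation."""
    change = percent_change_in_acetylation(with_substance, without_substance)
    if change <= -threshold_percent:
        return "candidate inhibitor"
    if change >= threshold_percent:
        return "candidate stimulator"
    return "no clear effect"


if __name__ == "__main__":
    # Hypothetical signal values in arbitrary units.
    print(percent_change_in_acetylation(400.0, 1000.0))
    print(classify_substance(400.0, 1000.0))
```

In practice the cutoff would be set from replicate controls rather than fixed in advance, so the threshold here should be read as a placeholder.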

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the transcription modulating activity of P/CAF, comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the transcription modulating activity and cell cycle progression
suppressing activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to inhibit the transcription modulating
activity of P/CAF by interfering with the histone acetyltransferase activity of P/CAF can
be added to this system and assayed for inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for
the ability to inhibit the binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under
conditions whereby the binding of p300 and P/CAF can occur; determining the amount
of p300 binding to P/CAF in the presence of the substance; and comparing the amount
of p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, a decreased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can inhibit the
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO: 3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells 
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can 
be carried out as taught herein. 

Additionally provided in the present invention is a bioassay for screening 
substances for the ability to inhibit the binding of CBP to P/CAF, comprising contacting 
a system in which the binding of CBP to P/CAF can be determined, with the substance 
under conditions whereby the binding of CBP to P/CAF can occur; determining the 
amount of CBP binding to P/CAF in the presence of the substance; and comparing the
amount of CBP binding to P/CAF in the presence of the substance with the amount of
CBP binding to P/CAF in the absence of the substance, a decreased amount of CBP
binding to P/CAF in the presence of the substance indicating a substance that can inhibit
the binding of CBP to P/CAF. The binding of CBP to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the CBP protein comprising the amino acid sequence of SEQ ID NO: 9 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells 
producing both CBP and P/CAF. Determination of the binding of CBP to P/CAF can be 
carried out as taught herein. 

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the histone
acetyltransferase activity of P/CAF. The acetylation of histones by P/CAF can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the P/CAF protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the histone acetyltransferase
activity of P/CAF can be added to this system and assayed for stimulating ability. 

The present invention further contemplates a bioassay for screening substances 
for the ability to stimulate the transcription modulating activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the transcription
modulating activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the transcription modulating
activity of P/CAF by increasing the histone acetyltransferase activity of P/CAF can be 
added to this system and assayed for stimulating ability. 

5 The present invention further provides a bioassay for screening substances for 

the ability to stimulate binding of p300 to P/CAF, comprising contacting a system in 
which the binding of p300 to P/CAF can be determined, with the substance under 
conditions whereby the binding of p300 to P/CAF can occur; determining the amount of 
p300 binding to P/CAF in the presence of the substance; and comparing the amount of 
p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, an increased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can stimulate the 
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a 
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells 
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can 
be carried out as taught herein. 

Additionally provided in the present invention is a bioassay for screening
substances for the ability to stimulate the binding of CBP to P/CAF, comprising
contacting a system in which the binding of CBP to P/CAF can be determined, with the 
substance under conditions whereby the binding of CBP to P/CAF can occur; 
determining the amount of CBP binding to P/CAF in the presence of the substance; and
comparing the amount of CBP binding to P/CAF in the presence of the substance with
the amount of CBP binding to P/CAF in the absence of the substance, an increased 
amount of CBP binding to P/CAF in the presence of the substance indicating a 
substance that can stimulate the binding of CBP to P/CAF. The binding of CBP to 
P/CAF can be determined in a system, for example, which can include a cell free
reaction mixture comprising a fragment of the CBP protein comprising the amino acid
sequence of SEQ ID NO:9 and P/CAF. Alternatively, the system can comprise a cell
extract produced from cells producing both CBP and P/CAF. Determination of the
binding of CBP to P/CAF can be carried out as taught herein.



Transcription modulating activity of P/CAF 

The present invention contemplates a method for inhibiting the transcription
modulating activity of P/CAF in a subject, comprising administering to the subject a
transcription modulating activity inhibiting amount of a substance in a pharmaceutically 
acceptable carrier. For example, the substance can be identified according to the 
protocols provided herein as one that can inhibit the transcription modulating activity of
P/CAF by preventing the binding of P/CAF to p300/CBP or by inhibiting the histone
acetyltransferase activity of P/CAF as well as by any other inhibitory mechanism as 
identified by the protocols provided herein. Inhibition of the transcription modulating 
activity of P/CAF in a subject is desirable, for example, to inhibit HIV TAT-mediated 
transcription and therefore, the method of the present invention can be used to treat
HIV-infected subjects.



The substance can be in a pharmaceutically acceptable carrier. By 
"pharmaceutically acceptable" is meant a material that is not biologically or otherwise 
undesirable, i.e., the material may be administered to a subject, along with the substance,
without causing any undesirable biological effects or interacting in a deleterious manner
with any of the other components of the pharmaceutical composition in which it is 
contained. The carrier would naturally be selected to minimize any degradation of the 
active ingredient and to minimize any adverse side effects in the subject. 



The transcription modulating activity and/or histone acetyltransferase activity of
P/CAF can be inhibited in a subject by administering to the subject a substance which
binds p300/CBP at the P/CAF binding site or a substance which binds the P/CAF 
protein at the p300/CBP binding site, the ultimate result being that P/CAF and 
p300/CBP do not bind with one another and P/CAF cannot exert its transcription
modulating and/or histone acetyltransferase effect. The substance can be a protein, such
as an antibody which binds the P/CAF protein at or near the p300/CBP
binding site, thereby preventing its binding, or an antibody which binds the p300/CBP
protein at or near the P/CAF binding site, thereby preventing its binding. The substance
can also bind the histone acetyltransferase site on P/CAF or the acetylation site on the
histone, thereby preventing acetylation by P/CAF.

The substance which binds p300/CBP, the P/CAF protein or the histone and has 
the net effect of inhibiting the transcription modulating effect and/or histone
acetyltransferase activity of P/CAF in the cell can be delivered to a cell in the subject by
mechanisms well known in the art.

Alternatively, a nucleic acid encoding a protein which binds either to p300/CBP 
or the P/CAF protein and has the net effect of inhibiting the transcription modulating 
effect and/or histone acetyltransferase activity of P/CAF in the cell can be delivered to a 
cell in the subject by gene transduction mechanisms well known in the art. For example,
nucleic acid can be introduced by liposomes as well as via retroviral or adeno-associated
viral vectors, as described below. 



The substance which inhibits the transcription modulating effect and/or histone 
acetyltransferase activity of P/CAF can be an antisense RNA or an antisense DNA which
binds the RNA or DNA of P/CAF, thereby preventing translation or transcription of the
RNA or DNA encoding P/CAF and having the net effect of inhibiting the transcription
modulating effect and/or histone acetyltransferase activity of P/CAF by inhibiting P/CAF
production. The antisense RNA of the present invention can be generated from the
nucleic acid of SEQ ID NO:14 (human) or SEQ ID NO:15 (mouse). Furthermore, the
antisense DNA can be a phosphorothioate oligodeoxyribonucleotide having the
nucleotide sequence of SEQ ID NO:16 (human) or of SEQ ID NO:17 (mouse). The
mouse antisense RNA can be used to inhibit the activity of mouse P/CAF, having the
nucleotide sequence of SEQ ID NO:18 and the amino acid sequence of SEQ ID NO:8.
The present invention also contemplates an antisense nucleic acid sequence which can
bind the DNA or RNA of any of the transcription factors or other proteins now known
or later identified to bind P/CAF, thereby inhibiting expression of the gene products of
these proteins and having the net effect of inhibiting the transcription modulating effect
and/or histone acetyltransferase activity of P/CAF.

The antisense nucleic acid can comprise a typical nucleic acid, but the antisense
nucleic acid can also be a modified nucleic acid or a derivative of a nucleic acid such as a
phosphorothioate analogue of a nucleic acid. The composition can comprise, for 
example, an antisense RNA that specifically binds an RNA encoded by the gene 
encoding the serum protein. Antisense RNAs can be synthesized and used by standard 
methods (62).

Antisense RNA can inhibit gene expression by forming an RNA/RNA duplex 
between the antisense RNA and the RNA transcribed from the target gene. The precise 
mechanism by which this duplex formation decreases the production of the protein 
encoded by the endogenous gene probably involves binding of complementary regions
of the normal sense mRNA and the antisense RNA strand with duplex formation in a
manner that blocks RNA processing and translation. Alternative mechanisms include
the formation of a triplex between the antisense RNA and duplex DNA or the formation
of a DNA-RNA duplex with subsequent degradation of DNA-RNA hybrids by RNase
H. Furthermore, an antigene effect can result from certain DNA-based oligonucleotides
via triple-helix formation between the oligomer and double-stranded DNA which results
in the repression of gene transcription. Regardless of the specific molecular mechanism,
the present invention results in inhibition of expression of the P/CAF gene by the
introduced and replicated DNA resulting in inhibition of the transcription modulating
and/or histone acetyltransferase activity of P/CAF, by a reduction in the expression of
the nucleic acid to which the antisense nucleic acid is hybridized, and therefore a
reduction of the gene product from the targeted gene. 
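
The base-pairing logic that underlies duplex formation can be illustrated with a short sketch that derives an antisense DNA oligonucleotide as the reverse complement of a sense-strand target. The function name and the example sequences are hypothetical and are not taken from any SEQ ID NO of this disclosure.

```python
# Watson-Crick pairing for DNA bases.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def antisense_dna(sense_target):
    """Return the antisense DNA oligo (written 5'->3') for a sense target (5'->3').

    RNA input is accepted by treating U as T before complementing.
    """
    seq = sense_target.upper().replace("U", "T")
    try:
        # Reverse the target, then complement each base.
        return "".join(_COMPLEMENT[base] for base in reversed(seq))
    except KeyError as err:
        raise ValueError(f"unexpected base: {err.args[0]}") from None
```

For instance, a sense target of ATGGCC would pair with the antisense oligonucleotide GGCCAT.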

The antisense nucleic acid may be obtained by any number of techniques known 
to one skilled in the art. One method of constructing an antisense nucleic acid is to
synthesize a recombinant antisense DNA molecule. For example, oligonucleotide
synthesis procedures are routine in the art and oligonucleotides coding for a particular
protein or regulatory region are readily obtainable through automated DNA synthesis.
A nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector. Double-stranded
molecules coding for relatively large proteins or regulatory regions can be synthesized
by first constructing several different double-stranded molecules that code for particular
regions of the protein or regulatory region, followed by ligating these DNA molecules
together. Once the appropriate DNA molecule is synthesized, this DNA can be cloned
downstream of a promoter in an antisense orientation. Techniques such as this are
routine in the art and are well documented. 

An example of another method of obtaining an antisense nucleic acid is to isolate 
that nucleic acid from the organism in which it is found and clone it in an antisense
orientation. For example, a DNA or cDNA library can be constructed and screened for
the presence of the nucleic acid of interest. Methods of constructing and screening such
libraries are well known in the art and kits for performing the construction and screening
steps are commercially available (for example, Stratagene Cloning Systems, La Jolla,
CA). Once isolated, the nucleic acid can be directly cloned into an appropriate vector in
an antisense orientation, or if necessary, be modified to facilitate the subsequent cloning
steps. Such modification steps are routine, an example of which is the addition of
oligonucleotide linkers which contain restriction sites to the termini of the nucleic acid.
General methods are set forth in Sambrook et al. (39).

The DNA that is introduced into the cell is in an expression orientation that is
antisense to a corresponding endogenous DNA or RNA of the cells. For example,
where an endogenous DNA comprises a gene which encodes a particular protein, the
introduced DNA is in an expression orientation opposite the expression of the
endogenous DNA; that is, the DNA operatively linked to a promoter is in an antisense
expression orientation relative to the corresponding endogenous gene. The introduced
DNA may be homologous to the entire transcribed gene or homologous to only part of
the transcribed gene. Alternatively, the sequence of the introduced DNA may be
divergent to that of the endogenous DNA but only divergent to the extent that
hybridization of the nucleic acids occurs, thereby preventing transcription. One skilled
in the art can determine the maximum extent of this divergence by routine screening of
antisense DNAs corresponding to an endogenous DNA of the cell. In this manner, one
skilled in the art can readily determine which fragments, or alternatively the extent of 
homology of the fragments or the entire gene that is necessary to inhibit gene 
expression. 

The antisense nucleic acids of the present invention can be made according to
protocols standard in the art, as well as described in the Examples provided herein. The
antisense nucleic acids can be administered to a subject according to the gene 
transduction protocols standard in the art, as described below. 

The present invention also contemplates a method for stimulating the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF in a
subject comprising administering to the subject a substance, in a pharmaceutically
acceptable carrier, determined according to the methods taught herein, to have a
stimulatory effect on the transcription modulating and/or histone acetyltransferase
activity of P/CAF. The substance can be one which has been identified, according to the
protocols provided herein, to stimulate histone acetyltransferase activity in P/CAF or
promote binding of P/CAF to p300/CBP. The stimulation of the transcription
modulation activity and/or histone acetyltransferase activity of P/CAF in a subject is
desirable, for example, to activate tumor suppressor p53 (which promotes apoptosis) or
to activate the muscle differentiation factor, MyoD. Thus, the method of the present
invention can be employed to treat cancer and to promote muscle differentiation in 
conditions where muscle differentiation is desired. The substance can be delivered to a 
cell in the subject by mechanisms well known in the art. 

Further contemplated in the present invention is a method for promoting binding
of P/CAF to p300/CBP in a subject, comprising administering to the subject a substance
identified by the methods provided herein to promote binding of P/CAF to either p300
or CBP.

Additionally, a nucleic acid encoding a protein which stimulates the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF can be delivered
to a cell in the subject by gene transduction mechanisms, as described below. 

Also provided in the present invention is a method of inhibiting the cell cycle 
progression inducing effect of an oncoprotein which binds p300/CBP in a subject
comprising transducing the cells of the subject with a vector comprising a nucleic acid
encoding the P/CAF protein; inducing expression of the nucleic acid in the cell to
produce the P/CAF in an amount which will allow the P/CAF gene product to replace
the oncoprotein bound to p300/CBP, whereby the replacement of the oncoprotein
bound to p300/CBP by the P/CAF gene product inhibits the cell cycle progression
inducing effect of the oncoprotein. The oncoprotein which binds p300/CBP in the cell
can be the adenovirus E1A oncoprotein.



A method for providing a functional P/CAF protein to a subject in need of the 
functional P/CAF protein is also provided, comprising transducing the cells of the
subject with a vector comprising a nucleic acid encoding the P/CAF protein and
inducing expression of the nucleic acid to produce the functional P/CAF protein in the
cell, thereby providing the functional P/CAF protein to the subject. The transduction of
the vector nucleic acid into the subject's cells can be carried out according to standard
gene therapy protocols well known in the art (see, for example, U.S. Patent No.
5,339,346).



Screening assays for p300/CBP 

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the histone acetyltransferase activity of p300/CBP comprising
contacting a system, in which histone acetylation by p300/CBP can be determined, with
the substance under conditions whereby histone acetylation by p300/CBP can occur;
determining the amount of histone acetylation by p300/CBP in the presence of the
substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased amount of histone acetylation by p300/CBP in the
presence of the substance indicating a substance that can inhibit the histone
acetyltransferase activity of p300/CBP. The acetylation of histones by p300/CBP can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the p300/CBP protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be
detected by autoradiography after separation by SDS-PAGE as described herein in the
Examples. Thus, the compound to be tested for the ability to inhibit the histone 
acetyltransferase activity of p300/CBP can be added to this system and assayed for 
acetyltransferase inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for 
the ability to inhibit the binding of a transcriptional factor to p300/CBP, comprising 
contacting a system in which the binding of a transcriptional factor to p300/CBP can be 
determined, with the substance under conditions whereby the binding of the
transcriptional factor and p300/CBP can occur; determining the amount of
transcriptional factor binding to p300/CBP in the presence of the substance; and
comparing the amount of transcriptional factor binding to p300/CBP in the presence of 
the substance with the amount of transcriptional factor binding to p300/CBP in the 
absence of the substance, a decreased amount of transcriptional factor binding to
p300/CBP in the presence of the substance indicating a substance that can inhibit the
binding of a transcriptional factor to p300/CBP. The binding of a transcriptional factor 
to p300/CBP can be determined in a system, for example, which can include a cell free 
reaction mixture comprising a transcriptional factor which binds p300/CBP and 
p300/CBP. Alternatively, the system can comprise a cell extract produced from cells
producing both a transcriptional factor which binds p300/CBP and p300/CBP. The
transcriptional factor which binds p300/CBP can be selected from, but is not limited to,
the group consisting of nuclear hormone receptors, CREB, c-Jun/v-Jun, c-Myb/v-Myb,
YY1, Sap-1a, c-Fos, MyoD and SRC-1, as well as any other transcriptional factor now
known or later identified to bind p300/CBP. The screening assay of the present
invention can also be used to identify substances which inhibit the binding of p300/CBP
to other components to which it is known to bind, for example, P/CAF, pp^^, TFIIB,
E1A, SV40 large T antigen, as well as any other substances now known or later
identified to bind p300/CBP. Determination of the binding of a transcriptional factor or 
other substance to p300/CBP can be carried out as taught in the Examples herein as well 
as by protocols described in the literature.

The present invention further contemplates a bioassay for screening substances 
for the ability to stimulate the histone acetyltransferase activity of p300/CBP comprising 
contacting a system, in which histone acetylation by p300/CBP can be determined, with 
the substance; determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and comparing the amount of histone acetylation by
p300/CBP in the presence of the substance with the amount of histone acetylation by
p300/CBP in the absence of the substance, an increased amount of histone acetylation
by p300/CBP in the presence of the substance indicating a substance that can stimulate
the histone acetyltransferase activity of p300/CBP. The acetylation of histones by
p300/CBP can be determined in a system including, for example, either core histones
(histones H2A, H2B, H3 and H4) or the nucleosome core particles (146 base pairs of
DNA wrapped around the octamer of core histones) as substrates, the p300/CBP
protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
stimulate the histone acetyltransferase activity of p300/CBP can be added to this system 
and assayed for stimulating ability. 

The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of a component, which binds p300/CBP, to p300/CBP,
comprising contacting a system in which the binding of the component to p300/CBP can
be determined, with the substance under conditions whereby the binding of the
component to p300/CBP can occur; determining the amount of component binding to
p300/CBP in the presence of the substance; and comparing the amount of component
binding to p300/CBP in the presence of the substance with the amount of component
binding to p300/CBP in the absence of the substance, an increased amount of
component binding to p300/CBP in the presence of the substance indicating a substance
that can stimulate the binding of the component to p300/CBP. The binding of the 
component to p300/CBP can be determined in a system, for example, which can include 
a cell free reaction mixture comprising the component and p300/CBP. Alternatively, the
system can comprise a cell extract produced from cells producing both the component
and p300/CBP. The component which binds p300/CBP can be any of the transcriptional 
factors or other proteins which are known or are identified in the future to bind 
p300/CBP, as set forth above. Determination of the binding of the component to 
p300/CBP can be carried out as taught in the Examples provided herein and according
to protocols available in the literature.



Histone acetyltransferase activity of p300/CBP 

A method for inhibiting the histone acetyltransferase activity of p300/CBP in a 
subject is provided in the present invention, comprising administering to the subject a
histone acetyltransferase activity inhibiting amount of a substance in a pharmaceutically
acceptable carrier. The mechanism of the inhibitory action of the substance can be the
inhibition of the binding of a DNA-binding transcription factor, such as, for example, a
nuclear hormone receptor, CREB, c-Jun/v-Jun, c-Myb/v-Myb, YY1, Sap-1a, c-Fos,
MyoD or SRC-1, to p300/CBP.

The histone acetyltransferase activity of p300/CBP can be inhibited in a subject 
by administering to the subject a substance which binds p300/CBP at the transcription 
factor binding site or a substance which binds the transcription factor protein at the 
p300/CBP binding site, the ultimate result being that the transcription factor and
p300/CBP do not bind with one another and p300/CBP cannot acetylate histones.

The substance which binds either to the transcription factor or the p300/CBP
protein and has the net effect of inhibiting the histone acetyltransferase activity of
p300/CBP in the cell can be identified according to the screening methods provided
herein and delivered to a cell in the subject by mechanisms well known in the art. The
substance can be a protein, such as an antibody which binds the p300/CBP protein
at or near the DNA-binding transcription factor binding site, thereby
preventing its binding, or an antibody which binds the DNA-binding transcription factor
at or near the p300/CBP binding site, thereby preventing its binding. The substance can
also bind the histone acetyltransferase site on p300/CBP (aa 1195-1673 on p300 or aa
1174-1850 on CBP) or the acetylation site on the histone, thereby preventing
acetylation by p300/CBP. 

Additionally, the substance can be a nucleic acid which can be expressed in the 
cell to produce a protein which inhibits the histone acetyltransferase activity of
p300/CBP. For example, a nucleic acid encoding a protein which binds either to a
transcription factor or the p300/CBP protein and has the net effect of inhibiting the 
histone acetyltransferase activity of p300/CBP in the cell can be delivered to a cell in the 
subject by gene transduction mechanisms well known in the art. For example, nucleic 
acid can be introduced by liposomes as well as via retroviral or adeno-associated viral
vectors, as described below.

The substance which inhibits the histone acetyltransferase activity of p300/CBP 
can be an antisense RNA or an antisense DNA which binds the RNA or DNA of
p300/CBP, thereby preventing translation or transcription of the RNA or DNA encoding
p300/CBP and having the net effect of inhibiting the histone acetyltransferase activity of
P/CAF by inhibiting p300/CBP production. The antisense RNA or DNA of the present 
invention can be produced and introduced into cells according to the same methods as 
set forth above for P/CAF antisense nucleic acids. 

The present invention also contemplates a method for stimulating the histone
acetyltransferase activity of p300/CBP in a subject comprising administering to the
subject a histone acetyltransferase activity stimulating amount of a substance, in a
pharmaceutically acceptable carrier, determined according to the methods taught herein,
to have a stimulatory effect on the histone acetyltransferase activity of p300/CBP. The
substance can exert a stimulatory effect by promoting the binding of a DNA-binding
transcription factor of the present invention to p300/CBP. The substance can be
delivered to a cell in the subject by mechanisms well known in the art. A nucleic acid 
encoding a protein which stimulates the transcription modulating activity of p300/CBP 
can be delivered to a cell in the subject by gene transduction mechanisms, as described 
below.

Gene transduction 

In the methods described above which include gene transduction into cells (i.e., 
addition of exogenous DNA into cells), the nucleic acids of the present invention can be 
in a vector for delivering the nucleic acids to the site for expression of the P/CAF
protein. The vector can be one of the commercially available preparations, such as the
pGM plasmid (Promega). Vector delivery can be by liposome, using commercially 
available liposome preparations or newly developed liposomes having the features of the 
present liposomes. Additionally, vector delivery can be via a viral system, including, but 
not limited to, retroviral, adenoviral and adeno-associated viral systems. Other delivery
methods can be adopted and routinely tested according to the methods taught herein.

The modes of administration of the liposome will vary predictably according to 
the disease being treated and the tissue being targeted. For example, for treating cancer 
in either the lung or the liver, which are both sinks for liposomes, intravenous delivery is
reasonable. For other localized cancers, as well as precancerous conditions,
catheterization of an artery upstream from the target organ is a preferred mode of
delivery, because it avoids significant clearance of the liposome by the lung and liver. 
For cancerous lesions at a number of other sites (e.g., skin cancer, localized dysplasias), 
topical delivery is expected to be effective and may be preferred, because of its
convenience.
Leukemias and other disorders involving dysregulated proliferation of certain
isolatable cell populations may be more readily treated by ex vivo administration of the 
nucleic acid. 



The liposomes may be administered topically, parenterally (e.g., intravenously),
by intramuscular injection, by intraperitoneal injection, transdermally, extracorporeally
or the like, although intravenous or topical administration is typically preferred. The
exact amount of the liposomes required will vary from subject to subject, depending on
the species, age, weight and general condition of the subject, the severity of the disease
being treated, the particular compound used, its mode of administration and the like.
Thus, it is not possible to specify an exact amount. However, an appropriate amount 
may be determined by one of ordinary skill in the art using only routine experimentation 
given the teachings herein. 

Parenteral administration, if used, is generally characterized by injection.
Injectables can be prepared in conventional forms, either as liquid solutions or
suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or
as emulsions. A more recently revised approach for parenteral administration involves
use of a slow release or sustained release system such that a constant level of dosage is
maintained. See, e.g., U.S. Patent No. 3,610,795, which is incorporated by reference
herein. 



Topical administration can be by creams, gels, suppositories and the like. Ex
vivo (extracorporeal) delivery can be as typically used in other contexts.

The present invention is more particularly described in the following examples 
which are intended as illustrative only since numerous modifications and variations 
therein will be apparent to those skilled in the art.



EXAMPLES

I. P/CAF studies. 

Cloning and characterization of P/CAF protein.

In human cells, CBP binds to c-Jun in a phosphorylation-dependent manner in 
association with stimulation of transcription (9). In yeast, GCN4 is believed to be a
c-Jun counterpart on the basis of similarities in DNA recognition (15) as well as the
participation of both proteins in UV signaling pathways (16). Yeast genetic screening
has led to the isolation of various cofactors for GCN4, including GCN5 (yGCN5),
ADA2 (yADA2) and ADA3 (yADA3) (17-19). These factors are considered to
function as a complex (or in a common pathway) based on genetic and protein-protein
interaction studies (18-22). Finally, p300/CBP and yADA2 exhibit significant sequence
similarity within a 50 amino acid region including a Zn2+ finger motif (3). Human
counterparts to yGCN5, yADA2, or yADA3 that interact with p300/CBP to mediate
transcriptional activation by c-Jun were searched for in various nucleotide sequence 
databases. 

Comparison of the yGCN5 protein sequence with various databases (23)
revealed significant similarities with the two randomly sequenced human cDNAs,
ETS05039 (24) (P=4.0×10⁻¹⁵) and NIB2000-5R (P=6.5×10⁻⁹). Given that these cDNAs
were truncated, human fetal liver and fetal brain cDNA libraries (Clontech) were
screened with ETS05039 and NIB2000-5R, respectively, and complete clones were
isolated from the human fetal liver cDNA library. The complete sequences revealed that
the ETS05039- and NIB2000-5R-derived clones are encoded by distinct genes but are
highly related within the protein coding regions (68% identity at the DNA level; 75% 
identity and 86% similarity at the protein level). The former encodes an N-terminal 
region with no sequence similarity to any proteins in the databases besides the yGCN5- 
related C-terminal region, whereas the latter encodes only the yGCN5-related region. 
Given that p300/CBP-binding activity was observed in the former polypeptide as shown
below, it was designated p300/CBP-associated factor (P/CAF), having the amino acid
sequence of SEQ ID NO:1 and the nucleotide sequence of SEQ ID NO:10, and the latter
was named human GCN5 (hGCN5), having the amino acid sequence of SEQ ID NO:5
and the nucleotide sequence of SEQ ID NO:11.
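
Identity figures such as those quoted above are computed over an alignment of the two sequences. The sketch below counts identical positions in two pre-aligned, equal-length sequences. The function name and the toy sequences are hypothetical, columns containing a gap are skipped by assumption, and the alignment procedure actually used to obtain the reported values is not stated in this passage.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Percent similarity would be computed the same way, except that conservative substitutions defined by a chosen scoring scheme would also be counted as matches.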

Additionally, an RNA blot (Clontech) was hybridized with a random-primed
probe made from the cDNA encoding P/CAF. RNA blotting indicated that transcripts
detected by the P/CAF and hGCN5 cDNAs are ubiquitously expressed, but the former is 
most abundant in heart and skeletal muscle, whereas the latter is most abundant in 
pancreas and skeletal muscle. 

P/CAF-p300/CBP interaction in vitro 

The P/CAF binding site was presumed to reside in the C-terminal one third of
CBP (residues 1,678-2,442) because it was observed that this region, when fused to a
DNA binding domain, activates transcription (4) in a manner repressed by coexpression
of 12S E1A. This region was divided into 6 overlapping fragments and each was
expressed in E. coli as a glutathione-S-transferase (GST) fusion protein. GST-CBP
fusions were incubated with recombinant P/CAF protein and, subsequently, purified 
using glutathione-Sepharose. Co-purified P/CAF was detected by immunoblotting 
analysis. 

To construct GST-fusions, various regions of CBP and p300 were amplified by 
PCR. A series of deletions of the CBP segment B was created by site-directed in vitro 
mutagenesis (30). These fragments were subcloned into pGEX-2T (Pharmacia).
GST-fusions were expressed in E. coli and extracted with buffer B [20 mM Tris-HCl (pH
8.0), 5 mM MgCl₂, 10% glycerol, 1 mM AEBSF, 0.1% NP40, 10 µg/ml of aprotinin, 10
µg/ml of leupeptin, 1 µg/ml of pepstatin A, 1 mM DTT] containing 0.1 M KCl for these
experiments. GST-CBP-segment B was purified by glutathione-Sepharose and
phenyl-Sepharose chromatographic steps. P/CAF, hGCN5, and E1A were expressed as
FLAG-fusions in Sf9 cells via baculovirus vectors and affinity-purified with M2-agarose
(ref. 30; Kodak-IBI). For interaction, a crude E. coli extract containing 20 pmol of
GST-fusion was incubated with 40-60 pmol of P/CAF or E1A in a total volume of 50 µl of
buffer B with 0.1 M KCl on ice for 10 min. Samples were further incubated with 10 µl
(packed volume) of glutathione-Sepharose at 4°C for 30 min, washed four times with
200 µl of buffer B containing 0.1 M KCl, and eluted with 20 µl of buffer E [50 mM
Tris-HCl (pH 8.0), 0.2 M KCl, 20 mM glutathione] for 60 min. Interacting proteins
were detected by anti-FLAG immunoblotting or silver staining.

For p300 interactions, the segment spanning residues 1763-1966 (segment B') of
p300, which is analogous to the CBP segment-B, was used. Twenty percent of the
P/CAF and hGCN5 inputs and 100% of the E1A input were also analyzed. In the GST
precipitation assays, almost identical amounts of the GST fusions were recovered in all
samples. Interaction between P/CAF and CBP (segment B) was determined in the
absence and in the presence of E1A. Control reactions with GST-CBP alone and
without GST-CBP were also performed. Input proteins were analyzed.

Two CBP segments, A and B, interacted specifically with P/CAF. The stronger
interaction was observed in the latter segment, which does not include the yADA2-like
Zn²⁺ finger. Given that the CBP segment-B is well conserved in p300 (66% identity,
75% similarity), the binding of P/CAF to p300 in vitro was also analyzed. For this
experiment, the p300 segment spanning residues 1763-1966, termed segment B', which
is analogous to the CBP segment-B, was used. Like CBP, p300 interacted specifically
with P/CAF. These studies demonstrated that P/CAF binds specifically to both p300 
and CBP in vitro. In contrast to P/CAF, hGCN5 did not bind to CBP or p300. 

These studies also demonstrated that the Zn²⁺ finger region of p300/CBP, which
shares sequence similarity with yADA2, is not essential for the interaction with P/CAF.
Cloning of a human structural homolog of yADA2, termed hADA2 (25) has revealed 
that, unlike the sequence similarity between p300/CBP and yADA2, which is restricted 
to a 50 amino acid region, hADA2 shares extensive similarity (30% identity, 52% 
similarity) to yADA2 over the entire protein sequence. Moreover, a computer search of 
the complete genomic sequence of Saccharomyces cerevisiae revealed that yeast does
not have counterparts of p300/CBP or P/CAF. Thus, the p300/CBP-P/CAF pathway
may have been acquired during metazoan evolution. 



Action of E1A in vitro

Previous reports indicated that E1A binds to both the p300 segment spanning
residues 1767-1816 and the CBP segment spanning residues 1805-1854 (7). These
interactions were reconfirmed in the present system; thus, both p300 and CBP segments
covering the previously identified regions interacted with E1A.

For further mapping, a series of deletions was introduced within the CBP
segment-B and tested for interactions with P/CAF and E1A. Deletions of residues
1801-1825 or 1824-1851 markedly reduced interactions with both P/CAF and E1A,
whereas deletion of residues 1850-1878 did not affect these interactions. Furthermore,
deletion of residues 1801-1851 completely abolished interactions with both P/CAF and
E1A. These data indicate that residues 1801-1851 of CBP are critical for interaction
with both P/CAF and E1A. Taken together with the evidence that CBP segment A (aa
residues 1,678-1,880) also binds to these factors, the above findings demonstrate that
P/CAF and E1A bind to the same or very closely spaced sites on CBP.
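
The reasoning behind the deletion mapping can be sketched in a few lines of Python. This is only an illustration of the logic, not part of the original experiments: the residue ranges are those reported above, while the function name `critical_span` and the boolean "binding retained" flag are assumptions introduced here. The sketch simply reports the smallest single interval that covers every deletion that impaired binding.

```python
# Deletion constructs of CBP segment B as reported in the text:
# (first deleted residue, last deleted residue, binding retained?)
DELETIONS = [
    (1801, 1825, False),  # interaction with P/CAF and E1A markedly reduced
    (1824, 1851, False),  # interaction markedly reduced
    (1850, 1878, True),   # interaction unaffected
    (1801, 1851, False),  # interaction abolished
]


def critical_span(deletions):
    """Return the smallest interval covering every deletion that impaired binding.

    Hypothetical helper: it ignores tolerated deletions and returns None
    when no deletion impaired binding.
    """
    impaired = [(start, end) for start, end, retained in deletions if not retained]
    if not impaired:
        return None
    first = min(start for start, _ in impaired)
    last = max(end for _, end in impaired)
    return first, last


if __name__ == "__main__":
    print(critical_span(DELETIONS))
```

With the four constructs listed, the covering interval is residues 1801-1851, which matches the region the text identifies as critical.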

Evidence that both P/CAF and E1A recognize the same p300/CBP segments
raises the possibility of direct competition between P/CAF and E1A for binding to
p300/CBP. To test this possibility, a competition experiment was performed with the
use of affinity purified recombinant proteins. The interaction of P/CAF with the
CBP-segment B was progressively inhibited by the addition of increasing amounts of E1A. In
contrast, no inhibition was caused by an E1A mutant which does not bind to p300/CBP
(E1AΔN). Similar results were obtained with the p300-segment B', leading to the
conclusion that P/CAF and E1A compete for the same binding sites in p300/CBP.



P/CAF-p300/CBP interaction in vivo

The in vivo interaction between P/CAF and p300/CBP was established by co- 
immunoprecipitation from a human osteosarcoma cell extract. Proteins in this extract 
were immunoprecipitated with rabbit anti-P/CAF, rabbit anti-CBP and anti-p300 
antibodies. For controls, cell extract was precipitated with rabbit control IgG or mouse
anti-HA monoclonal antibody. The precipitates were analyzed by immunoblotting with 
anti-P/CAF, anti-CBP and anti-p300 antibodies. 



Osteosarcoma cells were transfected with either control vector or E1A- or
E1AΔN-expression vectors. Extract from the transfected subpopulation was
immunoprecipitated with anti-P/CAF or control IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF antibodies. 

Rabbit anti-P/CAF antibody was raised to the P/CAF segment spanning residues
125-397 and purified by immunoaffinity chromatography (33). A mixture of
monoclonal antibodies raised to the human p300 segment spanning residues 1572-2371
(5) and rabbit polyclonal antibodies raised to the mouse CBP segment spanning residues
2-23 (for immunoprecipitation) and 1736-2179 (immunoblotting) were purchased from
Upstate Biotechnology. Approximately 2 × 10⁷ human osteosarcoma U-2 OS cells
(ATCC accession number HTB 96) were extracted with 10 ml of lysis buffer [25 mM
HEPES-KOH (pH 7.2), 150 mM potassium acetate, 2 mM EDTA, 1 mM DTT, 1 mM
AEBSF, 10 µg/ml of aprotinin, 10 µg/ml of leupeptin, 1 µg/ml of pepstatin A, 20 mM
sodium fluoride, 0.1% NP40]. Two to 10 ml of extract were incubated with 2 µg of the
respective antibody for four hours at 4°C. Fifty µl (packed volume) of protein-A
Trisacryl (Pierce) were added and incubation was continued for two hours. The matrix
was washed four times with 1 ml of the lysis buffer, then boiled in 2x SDS sample
buffer. Human osteosarcoma U-2 OS cells were transfected with 20 µg of the indicated
plasmid and 1 µg of sorting plasmid (pCMV-IL2R) (31). The transfected subpopulation
was purified by magnetic affinity cell sorting (32). Extract from approximately 2 × 10⁵
sorted cells was immunoprecipitated as described.




Anti-P/CAF antibody specifically detected a 95 kDa protein, which is very close 
to the calculated value for the full-length P/CAF, in the immunoprecipitates.
Anti-P/CAF antibody co-immunoprecipitated both CBP and p300. Similarly, anti-CBP
antibody also co-immunoprecipitated P/CAF. However, anti-p300 antibody did not
co-immunoprecipitate P/CAF. This is most likely due to steric interference since the
anti-p300 antibody was raised to the p300 segment spanning residues 1572-2371, which
includes the P/CAF binding region. These data demonstrate that P/CAF forms
complexes with both p300 and CBP in vivo. 

Action of E1A in vivo

The in vitro experiments described herein indicate that P/CAF and E1A compete
for the binding sites in p300/CBP. Thus, a study was conducted to determine whether
E1A targets the endogenous interaction between P/CAF and p300. An E1A-expression
vector was transiently transfected into human osteosarcoma cells and the transfected
subpopulation was purified by cell sorting. Then, the interaction between P/CAF and
p300 in transfected cells was examined by co-immunoprecipitation with anti-P/CAF
antibody. The endogenous interaction of P/CAF with p300 was drastically inhibited by
expression of E1A. On the other hand, no inhibition was observed with the E1A mutant
lacking the p300 binding domain (E1AΔN), indicating that E1A disrupts the
P/CAF-p300 complex in vivo through an interaction with p300.

Cell cycle regulation by P/CAF 

Given that binding of P/CAF to p300/CBP is inhibited by E1A, experiments
were performed to evaluate whether P/CAF, by binding to and forming a functional
complex with p300, is involved in the regulation of entry into S phase. This possibility
was addressed by examining whether transient expression of P/CAF would affect the
rate of G1/S transit in HeLa cells. P/CAF reduced the proportion of cells in S phase
relative to G1 in this assay.

HeLa cells were transfected by electroporation with 7 µg of P/CAF-expression
plasmid and/or 3 µg of the full-length or the N-terminally deleted (Δ2-36) E1A
12S-expression plasmid as indicated. These plasmids were constructed by subcloning
FLAG-P/CAF and E1A cDNAs into pCX (34) and pcDNAI (Invitrogen), respectively.
All samples, in addition, contained 1 µg of sorting plasmid (pCMV-IL2R) (31) and
carrier plasmid (pCX) to normalize the total amount of DNA to 11 µg. After
transfection, cells were incubated in Dulbecco's modified Eagle's medium with 10% fetal
bovine calf serum for 12 h, and subsequently labeled in medium containing 10 µM
bromo-deoxyuridine (BrdU) for 30 min. The transfected subpopulation
was purified by magnetic affinity cell sorting and nuclei were analyzed by dual parameter 
flow cytometry as described (32). 
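
The DNA normalization described above is simple arithmetic, sketched here for clarity. The sketch assumes, as the amounts in the text suggest, a fixed total of 11 µg of DNA per electroporation and 1 µg of sorting plasmid in every sample; the function name `carrier_ug` and its parameters are hypothetical and are introduced only for illustration.

```python
TOTAL_DNA_UG = 11.0  # total DNA per electroporation (from the text)
SORTING_UG = 1.0     # pCMV-IL2R sorting plasmid, present in every sample


def carrier_ug(pcaf_ug=0.0, e1a_ug=0.0):
    """Return the amount of pCX carrier plasmid (µg) that brings the total to 11 µg.

    Hypothetical helper: pcaf_ug and e1a_ug are the amounts of the P/CAF and
    E1A expression plasmids included in a given sample (0 if omitted).
    """
    used = pcaf_ug + e1a_ug + SORTING_UG
    carrier = TOTAL_DNA_UG - used
    if carrier < 0:
        raise ValueError("expression plasmids exceed the total DNA budget")
    return carrier


if __name__ == "__main__":
    # The four plasmid combinations implied by "and/or" in the text.
    for label, pcaf, e1a in [
        ("control", 0.0, 0.0),
        ("P/CAF", 7.0, 0.0),
        ("E1A", 0.0, 3.0),
        ("P/CAF + E1A", 7.0, 3.0),
    ]:
        print(label, carrier_ug(pcaf, e1a))
```

Under these assumptions the sample receiving both expression plasmids needs no carrier at all, which is consistent with the 11 µg total stated in the text.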

The fraction of cells accumulating in S phase in control cultures was 23%, 
compared to 15% in P/CAF-transfected cells. This effect was reproducible in multiple 
independent experiments. In parallel experiments to verify the utility of this 
experimental protocol, plasmids encoding E2F-1, simian virus 40 small t, cyclin A or 
cyclin E increased the accumulation of cells in S phase, whereas plasmids encoding the
cyclin-dependent kinase inhibitors p21 or p27 reduced the number of S phase cells. 

On the basis of evidence that E1A and P/CAF compete for binding sites on
p300, it seemed possible that cotransfection of P/CAF with E1A would oppose the
mitogenic effect caused by E1A. As shown by the data herein, this is indeed the case.
E1A alone has mitogenic activity in this experimental setting, while the E1A mutant
lacking the p300 binding domain (E1AΔN) has very weak activity. Comparable
expression levels between wild type and mutant E1A in the transfected cells were
revealed by immunoblotting analysis with anti-E1A. Intriguingly, when P/CAF was
cotransfected with E1A, the mitogenic activity of E1A was significantly counteracted by
P/CAF. These results show that P/CAF and E1A mediate antagonistic effects on cell
cycle progression. 

In the course of assessing P/CAF activity, it was also revealed that p300 is able 
to inhibit cell cycle progression under the same assay conditions. These findings suggest
that P/CAF and p300, perhaps by forming a complex, act in concert to suppress cell
cycle progression. 

Histone acetyltransferase activity in P/CAF 

Acetylation of the N-terminal histone tails has been considered to play a crucial 
role in accessibility of transcription factors to nucleosomal templates (26-27). Recently, 
yGCN5 has been identified as a histone acetyltransferase (28). On the basis of this
information, intrinsic histone acetyltransferase activity in P/CAF and hGCN5 was
examined. As substrates, the core histones (histones H2A, H2B, H3 and H4) and the
nucleosome core particles (146 base pairs of DNA wrapped around the octamer of core
histones) were used. 

Activity of hGCN5 and P/CAF that acetylates free histones or histones in the 
nucleosome core particle (35) was measured as described (36). Each reaction contained 
0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol of the histone
octamer or the nucleosome core particle and 10 pmol of [1-¹⁴C]acetyl-CoA. The
histone octamer dissociated into dimers or tetramers under assay conditions. Acetylated
histones were detected by autoradiography after separation by SDS-PAGE.

P/CAF and hGCN5 acetylated the core histones with almost the same efficiency.
Both factors acetylated histones H3 and H4, but preferentially H3. In contrast, very
weak or no acetylation by hGCN5 was detected in the nucleosome core particles.
Remarkably, significant acetylation by P/CAF was observed in a nucleosomal context.
Although all core histones are acetylated in the nucleus, P/CAF and hGCN5 did not
acetylate histones H2A and H2B in vitro.

The direct function of P/CAF is likely to involve its intrinsic histone acetyltransferase
activity. Although the exact molecular mechanisms by which acetylation of core histones
contributes to transcription remain undefined, acetylation of the histones is considered to
play an important role in transcriptional regulation (26-27). The positively charged
N-terminal tails of core histones are believed to affect nucleosome structure by interacting
with DNA at or near the nucleosome-spacer junction. Acetylation of the histone tails
presumably destabilizes the nucleosome and facilitates access by regulatory factors.
Likewise, there is a general correlation between the level of acetylation and
transcriptional activity of nucleosomal domains. The findings of the present invention
provide insights into the mechanisms of targeted histone acetylation.

Cellular factor p300/CBP binds to various sequence-specific factors that are 
involved in cell growth and/or differentiation, including CREB (3,4), c-Jun (9), Fos (11),
c-Myb (12) and nuclear receptors (13). P/CAF could stimulate the activation function
of these factors via promoter-specific histone acetylation. The present invention
demonstrates that E1A appears to perturb normal cellular regulation by disrupting the
connection between p300/CBP and its associated histone acetyltransferase. 

II. P300/CBP studies.

Purification of E1A-associated histone acetyltransferase.

FLAG-epitope tagged E1A (or ΔE1A) was expressed in Sf9 cells (ATCC
accession number CRL 1711) by infection with recombinant baculovirus (43). All purification
steps were carried out at 4°C. Extract was prepared from infected cells by one cycle of
freeze and thaw in buffer B (20 mM Tris-HCl, pH 8.0; 5 mM MgCl₂; 10% glycerol; 1
mM PMSF; 10 mM β-mercaptoethanol; 0.1% Tween 20) containing 0.1
M KCl and the complete protease inhibitor cocktail (Boehringer Mannheim). To
prepare E1A-immobilized beads, the extract was incubated with M2
anti-FLAG antibody agarose (Kodak-IBI) for four hours with rotation and
subsequently washed with the same buffer three times. The resulting beads were
incubated with HeLa (ATCC accession number CCL 2) nuclear extract for four to eight
hours and thereafter washed with the same buffer six times. Finally, FLAG-E1A was
eluted from the beads along with associated polypeptides by incubating with the same
buffer containing 0.1 mg/ml FLAG peptide.

For further purification, eluted polypeptides were dialyzed in 0.05 M KCl-buffer
B and subsequently loaded onto a SMART Mono Q column (Pharmacia) equilibrated
with the same 0.05 M KCl-buffer B. After washing, the column was developed with a
linear gradient of 0.05-1.0 M KCl in buffer B. Mono Q fractions were concentrated with
a MICROCON spin-filter (Amicon) and subsequently loaded onto a SMART Superdex
200 column (Pharmacia) equilibrated with 0.1 M KCl-buffer B.

Histone acetyltransferase assays 

Filter binding assays were performed as described (80) with minor modifications.
Samples were incubated at 30°C for 10-60 minutes in 30 µl of assay buffer containing
50 mM Tris-HCl, pH 8.0; 10% glycerol; 1 mM DTT; 1 mM PMSF; 10 mM sodium
butyrate; 6 pmol of [³H]acetyl CoA (4.3 mCi/mmole, Amersham Life Science Inc.), and
33 µg/ml of calf thymus histones (Sigma Chemical Co.). In experiments where synthetic
peptides were substituted for core histones, 50 pmol of each peptide were used. After
incubation, the reaction mixture was spotted onto Whatman P-81 phosphocellulose filter
paper and washed for 30 minutes with 0.2 M sodium carbonate buffer pH 9.2 at room
temperature with 2-3 changes of the buffer. The dried filters were counted in a liquid
scintillation counter.
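
Scintillation counts from a filter binding assay of this kind are commonly converted to an amount of incorporated label using the specific activity of the radiolabeled acetyl CoA. The following is a minimal sketch of that conversion, not a procedure taken from the original text. The function name `pmol_acetate`, the background subtraction and the counting-efficiency parameter are assumptions introduced here, and the numbers in the usage example are hypothetical. The only fixed constant is the definition of the curie (2.22 × 10¹² disintegrations per minute).

```python
DPM_PER_CI = 2.22e12  # disintegrations per minute in one curie


def pmol_acetate(cpm, background_cpm, efficiency, specific_activity_mci_per_mmol):
    """Convert filter counts (cpm) to pmol of labeled acetate retained on the filter.

    Hypothetical helper. efficiency is the counting efficiency of the
    scintillation counter as a fraction in (0, 1]; the specific activity is
    taken in mCi/mmol, the unit used in the text.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be a fraction in (0, 1]")
    if specific_activity_mci_per_mmol <= 0:
        raise ValueError("specific activity must be positive")
    # Background-corrected disintegrations per minute.
    dpm = (cpm - background_cpm) / efficiency
    # mCi/mmol -> Ci/pmol: 1e-3 Ci per mCi and 1e9 pmol per mmol.
    ci_per_pmol = specific_activity_mci_per_mmol * 1e-3 / 1e9
    return dpm / (DPM_PER_CI * ci_per_pmol)


if __name__ == "__main__":
    # Hypothetical reading: 2,320 cpm on the filter, 100 cpm background,
    # 100% counting efficiency, label at 1,000 mCi/mmol.
    print(pmol_acetate(2320.0, 100.0, 1.0, 1000.0))
```

The specific activity is left as a parameter so that the value printed on the supplier's data sheet for the lot actually used can be supplied.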

PAGE analysis was done as above except that 90 pmol of [¹⁴C]acetyl CoA (55
mCi/mmole, Amersham Life Science Inc.) and 9 pmol of core histones or
mononucleosomes were used. Core histones and mononucleosomes were prepared as
described (35). For trypsin digestion, reaction mixtures were further incubated with
various amounts of trypsin on ice for 30 minutes. The samples were analyzed on
one-dimensional SDS-PAGE gels or two-dimensional gels, where the first dimension was an
acid-urea-PAGE gel (44) and the second dimension was an SDS-PAGE gel.

Protein expression 

For baculovirus expression, cDNAs corresponding to p300 portions of aa 1-670, 
aa 671-1194 and aa 1135-2414 were amplified by PCR (EXPAND High Fidelity PCR
System; Boehringer Mannheim) as KpnI-NotI fragments. The resulting fragments were
subcloned into a baculovirus transfer vector having the FLAG-tag sequence (43). The
recombinant viruses were isolated using the BACULOGOLD system (Pharmingen),
according to the manufacturer's protocol, and were used to infect Sf9 cells (ATCC
accession number CRL 1711) to express FLAG-p300. Recombinant proteins were
affinity purified with M2 anti-FLAG antibody-immobilized agarose (Kodak-IBI)
according to the manufacturer's protocol. 

For bacterial expression, cDNAs encoding the p300 portions and the CBP 
portion (aa 1174-1850) were first subcloned into the baculovirus transfer vector having
the FLAG-tag as described above. Thereafter, the XhoI and NotI fragments encoding
FLAG-p300 or FLAG-CBP fusions were resubcloned into the E. coli expression vector
pET-28c (Novagen) digested with SalI and NotI. Recombinant proteins were
expressed in E. coli BL21(DE3) and affinity purified with M2-antibody agarose. 

Histone acetyltransferases that associate with E1A

Although the adenovirus E1A 12S protein (E1A) inhibits transcription of a
variety of genes via direct binding to p300/CBP (45), E1A also stimulates transcription
in some contexts (46). Thus, p300/CBP-bound E1A was tested to determine whether it
might recruit histone acetyltransferases or deacetylases to regulate transcription. In
addition, experiments were conducted as described below to determine if p300/CBP per
se is a histone acetyltransferase. 

Initially, recombinant FLAG-epitope tagged E1A was immobilized on
anti-FLAG antibody beads. Immobilized E1A was incubated with a HeLa nuclear
extract for affinity purification of E1A-associated polypeptides. FLAG-E1A
was then eluted from the beads, along with E1A-associated polypeptides, by
incubating with FLAG-peptide. Although E1A per se has no histone acetyltransferase
activity, E1A recruited significant amounts of histone acetyltransferase activity from the
nuclear extract. It is very unlikely that this activity is derived from P/CAF given that
E1A and P/CAF cannot bind to p300/CBP simultaneously (43). Consistent with this, no
P/CAF was detected in these fractions by immunoblotting.

The E1A N-terminus, a region that is not highly conserved among the various
adenovirus serotypes, is involved in p300/CBP binding in vivo. Mutations in the
N-terminal region lead to loss of the ability for p300/CBP binding without affecting RB
binding (1,47). Thus, the requirement of the E1A N-terminal region for the recruitment
of histone acetyltransferase activity was tested. In contrast to the wild type, the
N-terminal deleted form of E1A (ΔN-E1A) recruited only a background level of
acetyltransferase activity. In agreement with previous reports (47), the ΔN-E1A
showed no ability to interact with p300/CBP, although it still retained the ability to
interact with a variety of other polypeptides, including RB.

To define the relationship between p300/CBP and histone acetyltransferase 
activity, affinity purified E1A-binding polypeptides were separated on a Mono Q
ion-exchange column. Both p300/CBP and the acetyltransferase activity were coeluted
at 140 mM KCl, while most of the polypeptides were eluted at 260 mM KCl. The active
fraction of the Mono Q column (~140 mM KCl) was further separated on a Superdex-200 gel
filtration column. Both p300/CBP and the acetyltransferase activity coeluted after the
void volume, indicating that p300/CBP is involved in the histone acetyltransferase 
activity. 

p300 is a histone acetyltransferase 

The data provided herein indicate that p300 per se, or a polypeptide(s) 
associated with p300, possesses histone acetyltransferase activity. To test the former 
possibility, the acetyltransferase activity of recombinant p300 was measured. p300 was 
divided into three fragments, each of which was expressed in and purified from Sf9 cells
via a baculovirus expression vector. Histone acetyltransferase activity was readily
detected in the C-terminal fragment containing amino acids 1135-2414, whereas no
activity was found in the other fragments, demonstrating conclusively that p300 per se is 
a histone acetyltransferase. 




p300/CBP-histone acetyltransferase domain 

To map the histone acetyltransferase domain of p300, a series of deletions 
was prepared. Given the poor conservation of the glutamine-rich region (aa 1815-2414) 
in the C. elegans p300/CBP homolog (6), the p300 fragment encoding aa 1135-1810 
was expressed in and purified from E. coli. Importantly, this candidate region of p300
(aa 1135-1810) showed significant histone acetyltransferase activity. For further
mapping within this region, a series of N-terminal deletions was constructed. Deletion
of 60 residues, resulting in a fragment containing aa 1195-1810, had no effect on the
acetyltransferase activity, whereas the deletion of 185 residues, yielding a fragment
comprising aa residues 1320-1810, completely eliminated the acetyltransferase activity.

Next, a series of C-terminal deletions was analyzed to determine the requirement 
of the P/CAF (or E1A)-binding domain. The p300 fragments lacking the E1A binding
domain (aa 1195-1760, 1195-1706 and 1195-1673) still retained the acetyltransferase
activity, whereas the further truncated mutant (aa 1195-1652) completely lost the
acetyltransferase activity. Consistent with these results, the mutant with an internal
deletion of residues 1418-1720 showed no acetyltransferase activity. These data
demonstrate that the histone acetyltransferase domain is located between the
bromodomain and the E1A-binding domain. Given that the histone acetyltransferase
domain is highly conserved between p300 and CBP (91% similarity), the corresponding
region of CBP, aa residues 1174-1850, was expressed to confirm the acetyltransferase activity. As
expected, comparable activity was detected, indicating that both p300 and CBP are 
histone acetyltransferases. 
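
The boundary logic of this truncation series can be made explicit with a short Python sketch. It is an illustration only: the fragment coordinates and activity calls are those reported above, whereas the function name `boundary_windows` and the approach of bracketing each boundary between the smallest active fragment and the nearest inactive truncation are assumptions introduced here. The internal deletion (residues 1418-1720) is not modeled.

```python
# p300 fragments tested for histone acetyltransferase activity, as reported
# in the text: (first residue, last residue, activity detected?)
FRAGMENTS = [
    (1135, 1810, True),
    (1195, 1810, True),
    (1320, 1810, False),
    (1195, 1760, True),
    (1195, 1706, True),
    (1195, 1673, True),
    (1195, 1652, False),
]


def boundary_windows(fragments):
    """Bracket the N- and C-terminal boundaries of the active domain.

    Hypothetical helper. It takes the shortest active fragment as the smallest
    region known to suffice, then finds the nearest inactive fragment truncated
    further from each end. Returns ((n_active, n_inactive), (c_inactive, c_active)).
    """
    active = [(s, e) for s, e, ok in fragments if ok]
    inactive = [(s, e) for s, e, ok in fragments if not ok]
    n_ok, c_ok = min(active, key=lambda f: f[1] - f[0])
    # Inactive fragments truncated further from the N-terminus only.
    n_bad = min(s for s, e in inactive if s > n_ok and e >= c_ok)
    # Inactive fragments truncated further from the C-terminus only.
    c_bad = max(e for s, e in inactive if e < c_ok and s <= n_ok)
    return (n_ok, n_bad), (c_bad, c_ok)


if __name__ == "__main__":
    n_window, c_window = boundary_windows(FRAGMENTS)
    print("N-terminal boundary lies between residues", n_window)
    print("C-terminal boundary lies between residues", c_window)
```

Under these assumptions the series places the N-terminal boundary between residues 1195 and 1320 and the C-terminal boundary between residues 1652 and 1673, with aa 1195-1673 being the smallest fragment shown to retain activity.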

Among various acetyltransferases including histone acetyltransferases GCN5 and
P/CAF, putative acetyl-CoA binding sites are conserved (48). However, multiple
alignment analysis (49) showed that the p300/CBP histone acetyltransferase domain
does not belong to this group. Moreover, comparison of the p300/CBP histone
acetyltransferase domain with peptide sequence databases (23) showed no sequence
similarity to any other proteins. Accordingly, this invention shows that p300/CBP
represents a novel class of acetyltransferases in that it does not have the conserved motif
found among previously described acetyltransferases (48).

p300 acetylates all core histones in mononucleosomes

Substrate specificity for acetylation by p300 was also examined. As substrates, 
histone octamers and mononucleosomes (146 base pairs of DNA wrapped around the 
octamer of core histones) were used. Given that the histone octamer dissociates into 
dimers or tetramers under physiological conditions, the histone octamer is referred to
here as core histones. When core histones were used, p300 acetylated all four proteins, 
but preferentially H3 and H4. More importantly, in a nucleosomal context, p300 
acetylated all four core histones nearly stoichiometrically. In contrast, p300 acetylated 
neither BSA nor lysozyme. 

Hyperacetylated histones are believed to be linked with transcriptionally active 
chromatin (26,27,50,51). Hyperacetylated forms are found in histones H4, H3 and H2B, 
which have multiple acetylation sites in vivo. Thus, the level of acetylation by p300 was 
also tested. 

Mononucleosomes treated with p300 were analyzed by two-dimensional gel 
electrophoresis. A Coomassie blue-stained gel and the corresponding autoradiogram 
showed that a significant amount of histones, especially H4, were hyperacetylated. 
Importantly, acetylation levels by p300 were very close to those of hyperacetylated 
histones prepared from HeLa nuclei treated with sodium butyrate, a histone deacetylase
inhibitor. In contrast, no acetylated forms were detected in the reaction 
without p300. These results indicate that p300 acetylates histones in mononucleosomes 
to the hyperacetylated state by targeting multiple lysine residues. 



p300 acetylates the four lysines in the histone H4 N-terminal tail in vitro which are 
acetylated in vivo 

Lysines at positions 5, 8, 12 and 16 of histone H4 are acetylated in vivo 
(51). Recent studies with yeast histone acetyltransferases demonstrate the 
position-specific acetylation by distinct acetyltransferases, i.e., while cytoplasmic
acetyltransferases for histone deposition and chromatin assembly modify
positions 5 and 12, GCN5 modifies positions 8 and 16 (52). Accordingly, the positions
of acetylation by p300 were also determined. A series of synthetic peptides containing
acetylated lysines at various positions was used to determine the acetylation
site-specificity of p300. Consistent with the two-dimensional gel electrophoresis
analysis, the experiments with peptide substrates showed that p300 acetylates all four
lysines in the histone H4 that are acetylated in vivo. These results are consistent with the 
view that deposition-related diacetylated histones are deacetylated during maturation 
of chromatin (53). 



p300 preferentially acetylates the N-terminal histone tail

Histone acetyltransferases modify specific lysine residues in the N-terminal 
tail of core histones but not the C-terminal globular domain in vivo (26,27,50,51).
Structural models of nucleosomes (54,55,56) suggest that most of the lysine residues in
the C-terminal globular domain are buried. Therefore, experiments were conducted to
examine whether restricted acetylation of the N-terminal tail resulted from the substrate
specificity of the enzyme or inaccessibility of the enzyme to the core domain in
nucleosomes. The globular domains of all core histones contain a long helix flanked on
either side by a loop segment and short helix, termed the "histone fold" (54,55,56).
The histone fold is involved in formation of the stable H2A-H2B and H3-H4
hetero-dimers, consisting of extensive hydrophobic contacts between the paired
molecules. Therefore, it is likely that a histone monomer cannot fold properly, thereby
increasing access of the histone acetyltransferase to the core domain. Based on these
considerations, experiments were conducted to determine whether p300 acetylates free
histone H4 in an N-terminal-specific manner.

Histone H4 was acetylated with p300 and subsequently the histone tail was 
removed by partial digestion with trypsin. The distributions of radioactivity between 
intact and core histones were compared. While the globular core histone domain was 
predominant at the higher trypsin concentrations, radioactivity was detected mostly in 
the intact histone. These data demonstrate that p300 preferentially acetylates the
N-terminal tail of histone H4.

III. P/CAF interaction with MyoD

Tissue culture and transfection experiments 

C2C12 mouse cells (ATCC accession number CRL 1772) were grown in
Dulbecco's modified Eagle medium (DMEM) supplemented with 20% fetal bovine
serum (FBS) until they reached confluence. Differentiation was induced by switching
medium to differentiation medium (DM), consisting of DMEM containing 2% horse
serum. C3H/10T1/2 fibroblasts (ATCC accession number CCL 226) were grown in
DMEM supplemented with 10% FBS. Cells were transfected by the calcium phosphate
precipitation method. Total amounts of transfected DNA were equalized by empty
vector DNA. After 12 h incubation in medium containing the precipitated DNA, the
cells were washed and incubated in fresh DMEM containing 10% FBS for an additional
24 h. Afterwards, differentiation was induced by incubating in DM for 36 to 72 h.
Chloramphenicol acetyltransferase (CAT) assays were performed as previously
described (64,69). The quantities of cell extracts used for CAT assays were normalized
to β-galactosidase activity by cotransfection of 1 µg of the β-galactosidase expression
vector, pON260.

Expression vectors used for transfection experiments are as follows: 
pCX-P/CAF for P/CAF (43), pCMV-bp300 for p300 (65), pCMV-p300 (1869-2414)
(64) and pCMV-p300 (1514-1922) (60) for p300 wild type and mutants; pE1A12S,
pE1A12S R2G, pE1A12S Δ2-36 and pE1A12S Δ121-130 for E1A wild type and
mutants (66,67,68); and pEMSV-MyoD for MyoD (64).



The antisense P/CAF RNA expression vector, pcDNA3 P/CAF-AS, was created
as follows. The 2.5 kb EcoRI-KpnI fragment containing the entire P/CAF open reading
frame was isolated from pCX-P/CAF (43). This fragment was subcloned into the
EcoRI-KpnI sites of plasmid pcDNA3 (Invitrogen) so that the antisense P/CAF RNA is
expressed under the control of the CMV promoter. Reporter genes employed were
4RE-CAT and MCK-CAT (69). 4RE-CAT is driven by a synthetic promoter containing
4 copies of the E-box, whereas MCK-CAT is driven by the native MCK promoter
(nucleotides -1256 to +7).
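
By way of illustration only, E-box copies in a reporter promoter can be located with a short Python sketch. It assumes the commonly cited E-box consensus CANNTG, which is not stated in this description, and the promoter string is a made-up example rather than the actual 4RE or MCK sequence.

```python
import re

# Assumed consensus: CANNTG, where N is any base. The lookahead lets
# finditer report every start position, including overlapping sites.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(sequence):
    """Return (1-based position, site) for each CANNTG match in sequence."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in EBOX.finditer(sequence)]


# Hypothetical synthetic promoter carrying four E-box copies.
promoter = "GGCAGCTGTTCACCTGAACAGGTGCCCAGCTGAA"
for position, site in find_eboxes(promoter):
    print(position, site)
```

Only one strand is scanned; because CANNTG is its own reverse complement pattern, a site on the opposite strand is found at the same position.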

Microinjection and immunofluorescence 

Cells were grown on small glass slides, subdivided into numbered squares of 2
mm x 2 mm, and microinjected with purified and concentrated antibodies, as previously
described (70). For immunofluorescence, cells were fixed in either 2%
paraformaldehyde or 1:2 methanol/acetone solution, preincubated with 5% BSA/PBS
and incubated with the primary antibodies for 30 min at 37°C. Subsequently, antibody
was visualized by incubating with either rhodamine- or fluorescein-conjugated secondary
antibody for 30 min at 37°C. Injected antibodies were stained with a
rhodamine-conjugated secondary antibody and nuclei were counter-stained with DAPI as
previously described (69).

Antibodies employed are as follows: rabbit polyclonal affinity-purified
anti-P/CAF antibody (43), rabbit polyclonal anti-p300/CBP antiserum (71), mouse
monoclonal anti-MyoD antibody (clone 5.8A, kindly provided by P. Houghton), goat
polyclonal affinity-purified anti-c-Jun antibody (Santa Cruz) and rabbit pre-immune
serum.

Immunoprecipitation and DNA affinity purification 

Cells were resuspended in lysis buffer (20 mM NaPO4, 150 mM NaCl, 5 mM
MgCl2, 0.1% NP40, 1 mM DTT, 10 mM sodium fluoride, 0.1 mM sodium vanadate, 1
mM phenylmethylsulfonyl fluoride and 10 µg/ml each of leupeptin, aprotinin and
pepstatin). After 30 min of incubation on ice, samples were centrifuged at 12,000 x g for
30 min and the supernatants were used as cell extracts. Extracts were pre-cleared by
incubating with rabbit pre-immune serum and protein A/G Plus-Agarose (Santa Cruz)
for 2 h at 4°C. For immunoprecipitation, the supernatants were incubated with the
respective antibodies for 3 h at 4°C. Protein A/G Plus-Agarose was added, and
incubation continued for 3 h. The matrix was washed with lysis buffer, then boiled in
2X SDS sample buffer. Immunoblotting was performed using the ECL
chemiluminescent detection kit (Amersham) according to the manufacturer's protocol.

Affinity purification of E-box-bound complexes was done as previously
described (69). Briefly, 100 ng of biotinylated double-stranded DNA containing the
E-box was immobilized on streptavidin-conjugated magnetic beads and incubated with
500 µg of cell extracts in the presence of poly dI-dC. After extensive washing, bound
proteins were eluted with SDS sample buffer and analyzed by immunoblotting.

In vitro protein-protein interaction assays 
The CBP-B fragment and its deletion derivatives were expressed as
GST-fusions as described previously (43). MyoD and E1A (43) were expressed as
FLAG-fusion proteins in Sf9 cells via a baculovirus expression system and
affinity-purified on M2 anti-FLAG antibody-agarose (Kodak-IBI). Crude E. coli
extracts containing GST-fusions were incubated with various amounts of MyoD and/or
E1A in 50 µl of buffer B (20 mM Tris-HCl, pH 8.0, 0.1 M KCl, 5 mM MgCl2, 10%
glycerol, and 0.1% Nonidet P-40) on ice for 10 min. GST-precipitation was performed
as described (43). MyoD and E1A were detected by immunoblotting with anti-FLAG
M2 antibody. For the interaction between P/CAF and MyoD, 1.5 pmol of
FLAG-P/CAF and 15 pmol of FLAG-MyoD were incubated in 50 µl of buffer B on ice
for 10 min. The mixture was further incubated with 2 µg of anti-P/CAF (43) or
anti-hADA2 antibody for 60 min. The immunocomplexes were precipitated by
incubation with 10 µl of protein A-Trisacryl (Pierce) and rotated for 1-4 h at 4°C. The
matrix was washed 4 times with 200 µl of buffer B and boiled in 10 µl of 2X SDS
sample buffer. The proteins were resolved by 4%-20% gradient SDS-PAGE and
subjected to immunoblotting with the anti-FLAG M2 antibody. The blot was developed
with the SUPERSIGNAL chemiluminescent substrates (Pierce).

P/CAF coactivates muscle-specific transcription

P/CAF and MyoD were co-transfected into mouse C3H10T1/2 fibroblasts, and
MyoD-mediated transcription was determined from reporter activity driven by the
artificial (4RE) and the naturally occurring muscle creatine kinase (MCK) promoters.
Overexpression of P/CAF stimulated MyoD-dependent transcription several-fold from
both promoters. Similar results were obtained for the MyoD-activated myogenin
promoter. Transcriptional activation was further stimulated by co-transfecting
MyoD, P/CAF and p300 expression vectors, suggesting that P/CAF may function by
forming a complex with p300/CBP. Consistent with the lack of DNA binding capacity in
P/CAF, overexpression of P/CAF alone did not increase the basal transcriptional activity
of either enhancer. To test whether P/CAF and p300/CBP function in the same pathway,
two dominant negative forms of p300 were employed which specifically inhibit
p300/CBP-mediated transcription (60,64). The p300 segment spanning residues
1514-1922 inhibits MyoD-dependent activation via direct interaction with MyoD
(60), whereas the p300 segment spanning residues 1869-2414 inhibits it without direct
interaction (64). Both dominant negative mutants inhibited MyoD coactivation by
P/CAF, suggesting that P/CAF and p300/CBP function in the same pathway.

For further elucidation of the activation mechanism by P/CAF, the effect of E1A,
which inhibits MyoD-dependent transcription and differentiation (66,72,73) via direct
interaction with p300/CBP (65,78), was tested. Expression of E1A in C3H10T1/2
fibroblasts inhibited stimulation of MyoD-directed transcription by P/CAF
overexpression. E1A mutants lacking p300/CBP-binding activity, E1A Δ2-36 and E1A
R2G (67,79), had almost no effect. On the other hand, an E1A mutant retaining
p300/CBP-binding activity, E1A Δ121-130, behaved like the wild type. Since E1A
associates with p300/CBP, but not with P/CAF, these results suggest that P/CAF
functions in MyoD-directed transcription via interaction with p300/CBP.

To address the role of P/CAF as a myogenic coactivator in a more relevant
environment, P/CAF was overexpressed in proliferating C2C12 myoblasts, which express
endogenous myogenic bHLH factors. As observed in fibroblasts, overexpression of
P/CAF stimulated muscle-specific transcription. Concomitant expression of exogenous
p300 increased P/CAF-mediated coactivation. The repression exerted by wild type E1A,
but not mutant E1A Δ2-36, on P/CAF coactivation of MyoD was also observed in
muscle cells.

Similar experiments were performed with myogenic cell lines that were stably
transformed with wild type or mutant E1A-expressing vectors (66). Coactivation by
P/CAF was inhibited by wild type E1A or the E1A mutant that retains
p300/CBP-binding activity (E1A Δ121-130). In contrast, E1A mutants that lack
p300/CBP-binding (E1A Δ2-36 and E1A R2G) allowed transcriptional coactivation by
P/CAF. Taken together, these experiments show that P/CAF coactivates MyoD-directed
transcription via interaction with p300/CBP.

P/CAF stimulates myogenic differentiation

Given that P/CAF potentiates MyoD-directed transcription, the ability of P/CAF
to assist MyoD in promoting myogenic differentiation was investigated. To this end,
C3H10T1/2 fibroblasts were transiently transfected with P/CAF and MyoD expression
vectors. An expression vector for the green fluorescent protein (GFP) was
co-transfected to identify transfected cells. After incubation in differentiation medium,
the myogenic conversion of transfected cells was determined by simultaneous expression
of GFP and the differentiation-specific marker myosin heavy chain (MHC). Forced
expression of MyoD in fibroblasts caused muscle differentiation in 12% of the
transfected fibroblasts. This myogenic conversion rose to 20% when MyoD and
P/CAF were co-expressed. As observed in transcription experiments, stimulation of
differentiation by P/CAF was counteracted by co-transfection with the p300 dominant
negative mutant, p300 (1869-2414). Consistent with a general role for coactivators,
overexpression of P/CAF alone was unable to differentiate fibroblasts.
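
The size of the P/CAF effect follows directly from the two conversion rates reported above (12% with MyoD alone, 20% with MyoD and P/CAF). A minimal Python sketch of the arithmetic, with illustrative function and variable names:

```python
def fold_change(treated_pct, baseline_pct):
    """Ratio of two conversion rates given in percent."""
    if baseline_pct <= 0:
        raise ValueError("baseline must be positive")
    return treated_pct / baseline_pct


myod_alone = 12.0      # % of transfected fibroblasts converting, MyoD alone
myod_plus_pcaf = 20.0  # % converting when P/CAF is co-expressed

fold = fold_change(myod_plus_pcaf, myod_alone)
gain_points = myod_plus_pcaf - myod_alone

print(f"fold change: {fold:.2f}")
print(f"absolute gain: {gain_points:.0f} percentage points")
```

In other words, co-expression of P/CAF raises the fraction of converted cells by 8 percentage points, a relative increase of roughly two thirds over MyoD alone.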

Similar experiments were done using proliferating C2C12 myoblasts, in which the
differentiation program is already committed. Most of the myoblasts differentiated into




myotubes upon overexpression of P/CAF, whereas only a modest effect was observed upon
overexpression of p300. In contrast, differentiation was inhibited slightly by overexpressing
c-Jun. This inhibitory effect presumably was caused by titration of p300/CBP, which
associates directly with c-Jun (74). A similar inhibition was observed with the p300
dominant negative mutant. Consistent with the transcriptional effect, E1A almost
completely inhibited differentiation. The E1A mutant R2G, lacking p300/CBP-binding
capability but retaining the retinoblastoma protein (Rb)-binding capability, only partially
inhibited differentiation, although this same mutant
inhibited transcription as severely as the wild type. Taken together, these data show that
P/CAF stimulates muscle differentiation by coactivating MyoD function via p300/CBP
association.

P/CAF is essential for myogenic transcription and differentiation 

To test the necessity of P/CAF for myogenic transcription, experiments were
conducted whereby P/CAF synthesis was inhibited by expressing antisense P/CAF RNA.
A vector from which the P/CAF mRNA is transcribed in the antisense orientation
(P/CAF-AS) was transfected with P/CAF and MyoD expression vectors into fibroblasts
and MyoD-dependent transcription was examined. Cotransfection of the antisense
expression vector strongly inhibited MyoD-dependent transcription, reducing it below the
level of induction elicited by MyoD alone, demonstrating that expression of P/CAF antisense
RNA inhibits not only the coactivation exerted by exogenous P/CAF but also that of
endogenous P/CAF. These results indicate that P/CAF is essential for MyoD-dependent
transcription.

Studies were also carried out to determine whether expression of P/CAF
antisense RNA inhibits myogenic differentiation. C3H10T1/2 fibroblasts were transiently
transfected with various expression vectors with or without the P/CAF antisense RNA
expression vector. Expression of P/CAF antisense RNA reduced MyoD-mediated
myogenic conversion of fibroblasts. Expression of P/CAF antisense RNA also
counteracted the stimulatory effect of both P/CAF and p300 on myogenic
differentiation. These data support the view that P/CAF and p300/CBP coactivate
MyoD-dependent transcription in the same pathway. More drastic inhibition was
observed in C2C12 myoblasts in similar experiments. Therefore, it can be concluded that
P/CAF is essential for transcription of muscle-specific genes and hence differentiation
into myotubes.

To further confirm the essential role of P/CAF in myogenic differentiation,
blockage experiments by antibody microinjection were performed. Antibodies were
injected into the cytoplasm of proliferating C2C12 myoblasts to prevent the nuclear
transport of newly synthesized target proteins. After incubation in the differentiation
medium, the degree of differentiation was determined. Microinjection of an anti-P/CAF
antibody almost completely inhibited differentiation. Similar results were obtained by
microinjecting anti-p300/CBP antibodies. Although microinjection of either
anti-p300/CBP or anti-P/CAF antibody was sufficient to inhibit differentiation, an even
greater inhibition was observed when both were coinjected. Microinjection of
anti-P/CAF or anti-p300/CBP antibody did not interfere with induction of p53 by DNA
damaging agents, showing the specificity of the inhibition by the antibodies. In contrast to
anti-P/CAF or anti-p300/CBP antibodies, the injection of anti-MyoD antibody only
partially inhibited differentiation, supporting the view of functional redundancy between
MyoD and Myf-5 (75,76). Injection of anti-c-Jun antibody or control antibody did not
interfere with muscle differentiation.

Similar experiments were performed with C3H10T1/2 fibroblasts stably
expressing MyoD. In these cells, either anti-p300/CBP or anti-P/CAF antibody
completely inhibited muscle differentiation. In contrast to myoblasts, anti-MyoD
antibody completely blocked differentiation in the fibroblasts expressing MyoD.
Anti-c-Jun and control antibodies did not interfere with differentiation. Taken together,
these results demonstrate that P/CAF and p300/CBP are indispensable for activation of
the myogenic program.

p300/CBP, P/CAF and MyoD form a multimeric complex in vivo

The data described above indicate that P/CAF stimulates MyoD-directed
transcription via association with p300/CBP. Thus, experiments were conducted to
investigate whether P/CAF, p300/CBP and MyoD could associate in a complex.
First, cellular extracts derived from C2C12 myotubes were subjected to
immunoprecipitation. Both anti-MyoD and anti-p300/CBP antibodies co-precipitated
P/CAF. In a complementary experiment, both anti-p300/CBP and anti-P/CAF
antibodies also co-precipitated MyoD, suggesting that these factors form a multimeric
protein complex in myotubes.

Next, attempts were made to detect this complex on the E-box, the DNA
binding site for MyoD. Immobilized DNA containing an E-box sequence was incubated
with myotube extracts. After extensive washing, P/CAF, p300/CBP and MyoD were
analyzed by immunoblotting. P/CAF, p300/CBP and MyoD were all affinity purified on
the immobilized DNA, whereas they were not purified on the control DNA lacking the
E-box. Given that P/CAF and p300/CBP per se cannot bind to DNA, these observations
indicate that P/CAF and p300/CBP are recruited through MyoD at the E-box sites to
form a multi-protein complex.

Complex formation is inhibited by viral transforming factors

Since the oncoviral proteins E1A and large T antigen inhibit myogenic
transcription and differentiation, the effect of these factors on the formation of
complexes on the E-box was tested. Importantly, only very small amounts of P/CAF and
p300/CBP were co-purified on the E-box from extracts of myocytes that stably express
E1A or large T antigen, although MyoD was detected under these conditions. The lower
recovery of MyoD from E1A-expressing muscle cells could reflect the low level of
MyoD in the extracts (66). These results indicate that E1A and large T antigen
dissociate P/CAF and p300/CBP from MyoD without altering MyoD binding to DNA.

Consistent with the previous observations that transiently expressed E1A
prevents interaction between P/CAF and p300/CBP in vivo (43), the association
between p300/CBP and P/CAF was abolished in myoblasts stably transformed by wild
type E1A but not in those clones transformed with the E1A mutant R2G, which is
unable to bind p300/CBP. Similarly, the interaction between p300/CBP and P/CAF was
abolished by large T antigen but not by the mutant protein that localizes to the
cytoplasm (77).

Interaction between MyoD, P/CAF and CBP in vitro 

Previous interaction experiments in vitro indicate that the CBP region spanning
residues 1801 to 1850 is crucial for interaction with both P/CAF and E1A (43). While
most sequence-specific factors bind to CBP sites distinct from the P/CAF/E1A binding
sites, MyoD interacts with an overlapping CBP fragment called the CH3 region
(60,64,65). To understand how P/CAF, p300/CBP and MyoD associate, the CBP sites
important for MyoD binding were mapped more precisely. Consistent with previous
reports (60,64,65), the CBP fragment spanning residues 1801-2000 (fragment B) bound
MyoD. Moreover, deletion of residues 1801 to 1850 within fragment B completely
abolished interaction with MyoD, which is similar to the results obtained with P/CAF
and E1A. Importantly, an internal deletion of residues 1850-1878 abolished the MyoD
interaction with CBP, while it did not affect binding of E1A or P/CAF (43). These
results suggest that MyoD and P/CAF bind to distinct sites of p300/CBP, although the
binding sites may overlap. Moreover, a direct interaction was observed between MyoD
and P/CAF, which may contribute to stabilization of the multimeric complex.
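
The reasoning behind the deletion mapping can be sketched in a few lines of Python. The sketch simply records, for each CBP deletion reported above, which partners lose binding, and then collects the residues each partner depends on. The data structure and function names are illustrative and are not part of the experimental method.

```python
# Binding lost for each CBP deletion, as reported in the text above.
LOST_BINDING = {
    (1801, 1850): {"MyoD", "P/CAF", "E1A"},
    (1850, 1878): {"MyoD"},
}


def merge_spans(spans):
    """Merge overlapping or adjacent (start, end) residue spans."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def required_regions(lost_binding):
    """Map each partner to the CBP residues whose deletion abolishes binding."""
    spans_by_partner = {}
    for span, partners in lost_binding.items():
        for partner in partners:
            spans_by_partner.setdefault(partner, []).append(span)
    return {partner: merge_spans(spans)
            for partner, spans in spans_by_partner.items()}


regions = required_regions(LOST_BINDING)
for partner in sorted(regions):
    print(partner, regions[partner])
```

Under these inputs MyoD depends on residues 1801-1878, whereas P/CAF and E1A depend only on residues 1801-1850, which is the pattern of overlapping but distinct binding sites described above.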

These data show that E1A prevents not only the interaction of p300/CBP with
P/CAF but also that with MyoD in vivo. To obtain evidence that this
inhibition is due to the direct action of E1A, competition experiments were performed
in vitro. Importantly, the interaction between CBP and MyoD was strongly inhibited by
addition of E1A, implying that E1A inhibits myogenic transcription by disrupting
multiple interactions.



Although the present process has been described with reference to specific
details of certain embodiments thereof, it is not intended that such details should be
regarded as limitations upon the scope of the invention except as and to the extent that
they are included in the accompanying claims.

Throughout this application various publications are referenced by numbers
within parentheses. Full citations for these publications are as follows. The disclosures
of these publications in their entireties are hereby incorporated by reference into this
application in order to more fully describe the state of the art to which this invention
pertains.

SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT: The United States of America, as represented by the
Secretary, Department of Health and Human Services, c/o 
National Institutes of Health, Office of Technology Transfer, 
6011 Executive Boulevard, Suite 325, Rockville, Maryland 20842 

(ii) TITLE OF THE INVENTION: METHODS AND COMPOSITIONS FOR 

p300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF 

(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: NEEDLE & ROSENBERG, P.C. 

(B) STREET: Suite 1200, 127 Peachtree Street, NE 

(C) CITY: Atlanta 

(D) STATE: GA 

(E) COUNTRY: USA 

(F) ZIP: 30303 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 23-JUL-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: Corresponding U.S. Serial No. 60/022,27, 

(B) FILING DATE: 23-July-1996 



(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: Miller, Mary L 

(B) REGISTRATION NUMBER: 39,303 

(C) REFERENCE/DOCKET NUMBER: 14014.0238/P

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 404/688-0770 

(B) TELEFAX: 404/688-9880 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO : 1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 832 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1 5 10 15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala ??? Leu
20 25 30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala ??? Ala Gly Gly
35 40 45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
50 55 60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65 70 75 80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
85 90 95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
100 105 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
115 120 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
130 135 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu ??? Asn Arg
145 150 155 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
165 170 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
180 185 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
195 200 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
210 215 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225 230 235 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg ??? Asn
245 250 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
260 265 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
275 280 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
290 295 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305 310 315 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
325 330 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met
340 345 350

Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp
355 360 365

Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr Val
370 375 380

Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr Ser
385 390 395 400

Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys Lys
405 410 415

Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Thr
420 425 430

Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly Asp
435 440 445

Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp Pro
450 455 460

Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala
465 470 475 480

Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe
485 490 495

His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu
500 505 510

Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg
515 520 525

Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys
530 535 540

Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe
545 550 555 560

Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val
565 570 575

Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His
580 585 590

Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr Tyr
595 600 605

Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys
610 615 620

Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr
625 630 635 640

Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro Tyr
645 650 655

Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys
660 665 670

Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu
675 680 685

Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro
690 695 700

Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys
705 710 715 720

Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile Leu
725 730 735

Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro Val
740 745 750

Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro Met
755 760 765

Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val Ser
770 775 780

Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys
785 790 795 800

Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile Leu
805 810 815

Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys
820 825 830
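
For readers who prefer the compact one-letter notation, a row of the listing can be converted with a small Python sketch. The mapping is the standard three-letter to one-letter amino acid code; the function name is illustrative, and the example row is residues 817 to 832 of SEQ ID NO: 1.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row):
    """Convert a whitespace-separated row of three-letter codes."""
    letters = []
    for residue in row.split():
        if residue not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {residue!r}")
        letters.append(THREE_TO_ONE[residue])
    return "".join(letters)


last_row = "Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys"
print(to_one_letter(last_row))
```

Unrecognized codes raise an error rather than being skipped, so a damaged or misspelled residue is not silently dropped from the converted sequence.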



(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 481 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln
1 5 10 15

Asp Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr
20 25 30

Val Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr
35 40 45

Ser Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys
50 55 60

Lys Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met
65 70 75 80

Thr Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly
85 90 95

Asp Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp
100 105 110

Pro Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser
115 120 125

Ala Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu
130 135 140

Phe His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile
145 150 155 160

Leu Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro
165 170 175

Arg Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His
180 185 190

Lys Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys
195 200 205

Phe Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala
210 215 220

Val Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn
225 230 235 240

His Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr
245 250 255

Tyr Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser
260 265 270

Lys Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp
275 280 285

Tyr Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro
290 295 300

Tyr Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys
305 310 315 320

Lys Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly
325 330 335

Leu Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile
340 345 350

Pro Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser
355 360 365

Lys Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile
370 375 380

Leu Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro
385 390 395 400

Val Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro
405 410 415

Met Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val
420 425 430

Ser Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys
435 440 445

Lys Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile
450 455 460

Leu Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp
465 470 475 480

Lys



(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 203 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 



Arg 


Val 


Val 


Gin 


His 


Thr 


Lys 


Gly 


Cys 


Lys 


Arg 


Lys 


Thr 


Asn 


\j± y 




1 








5 










10 










X D 




Cys Pro Ile Cys Lys Gln Leu Ile Ala Leu Cys Cys Tyr His Ala Lys

20 25 30






His Cys Gln Glu Asn Lys Cys Pro Val Pro Phe Cys Leu Asn Ile Lys

35 40 45








Gln Lys Leu Arg Gln Gln Gln Leu Gln His Arg Leu Gln Gln Ala Gln

50 55 60










Met Leu Arg Arg Arg Met Ala Ser Met Arg Thr Gly Val Val Gly Gln

65 70 75 80


Gin 


Gin 


Gly 


Leu 


Pro 


Ser 


Pro 


Thr 


Pro 


Mia 




Pro 


■L 111. 




Pro 


Thr 






85 










90 










95 




Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr Ser Gln Pro

100 105 110






Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro Arg Thr Gln

115 120 125








Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln Val Thr Pro

130 135 140










Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly Pro Pro Pro

145 150 155 160


Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala Glu Thr Gln

165 170 175




Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile Gln His Gln

180 185 190






Met Pro Pro Met Thr Pro Met Ala Pro Met Gly

195 200
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala

1 5 10 15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu

20 25 30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly

35 40 45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala

50 55 60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65 70 75 80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val

85 90 95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys

100 105 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile

115 120 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
130 135 140

Ala


His 


Val 


Ser 


His 


Leu 


Glu 


Asn 


Val 


Ser 


Glu 


Glu 


Glu 


Met 


Asn 


Arg 


n y| c 










1 J u 










_L _) -J 










160 


Leu 


Leu 


C 1 T T 

vj.Ly 


lie 


Val 
iuJ 


Leu 


Asp 


v ax 


m 1 1 

Ul u 


Tyr 
17 0 


Leu 


Phe 


Thr 


Cys 


Val 
175 


His 


Lys 


Ijrl U 


1 U. 


Asp 
ion 




Asp 


Th r 

X 11 A. 


Ly s 


m n 

w X 1 1 

loo 


v a x 


fur 


Phe 


Tvr 

j. y J. 


Leu 

1 ^ u 


Phe 


Lys 


Leu 


Leu 


Arg 

iy j 


Lys 


Ser 


He 


Leu 


Gin 

\J \J 


Arg 


Gly 


Lys 


Pro 


Val 

9 0 S 
^ w -j 


Val 


Glu 


Gly 


Ser 


Leu 

9 t n 


Glu 


Lys 


Lys 


Pro 


Pro 
215 


Phe 


Glu 


Lys 


Pro 


Ser 
22 0 


He 


Glu 


Gin 


Gly 


Val 


Asn 


Asn 


xriie 


v a x 




r P\/ r* 
i yi 


Lys 




Ser 


His 


Leu 


Pro 


Ala 


Lys 


Glu 


O O R 

*£ZO 










9^0 










9 *3 s 
j j 










9 4 0 


Arg 




inr 


lie 


V a. J. 


url U 


Leu 


a 


Lys 


l ie L, 




L€U 


Mail 


Zl rrf 


Tip 
J. -L vT 


As n 








9 a ^ 

^1 1 3 










^ JU 










9 S S 
4* j j 




Tyr 


Trp 


tlx s 


Leu 

zi bU 


r;i , , 
ul u 




Pro 


r- 


o^lli 

Oct 
ZOO 


Arg 


Arg 


Leu 


21 t~ rt 

•r-l-L. y 


Ser 

9 7 0 


Pro 


As n 


Asp 


Asp 


He 


Ser 


triy 


Tyr 


Lys 


ijrl U 


Asn 


Tyr 


i nx 


Arg 


i rp 

9 ft R 
ZD J 


Leu 




i yr 


Cys 


Asn 


Val 


Pro 


Gin 


Phe 


Cys 


Asp 


Ser 


Leu 


Pro 


Arg 


Tyr 


Glu 


Thr 


Thr 


290 










295 










300 










Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg

305 310 315 320


Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu

325 330 335




Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser

340 345 350







(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 476 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



Met Leu Glu Glu Glu Ile Tyr Gly Ala Asn Ser Pro Ile Trp Glu Ser

1 5 10 15




Gly Phe Thr Met Pro Pro Ser Glu Gly Thr Gln Leu Val Pro Arg Pro

20 25 30






Ala Ser Val Ser Ala Ala Val Val Pro Ser Thr Pro Ile Phe Ser Pro

35 40 45








Ser Met Gly Gly Gly Ser Asn Ser Ser Leu Ser Leu Asp Ser Ala Gly

50 55 60










Ala Glu Pro Met Pro Gly Glu Lys Arg Thr Leu Pro Glu Asn Leu Thr

65 70 75 80


Leu Glu Asp Ala Lys Arg Leu Arg Val Met Gly Asp Ile Pro Met Glu

85 90 95




Leu Val Asn Glu Val Met Leu Thr Ile Thr Asp Pro Ala Ala Met Leu

100 105 110






Gly Pro Glu Thr Ser Leu Leu Ser Ala Asn Ala Ala Arg Asp Glu Thr

115 120 125








Ala Arg Leu Glu Glu Arg Arg Gly Ile Ile Glu Phe His Val Ile Gly

130 135 140










Asn Ser Leu Thr Pro Lys Ala Asn Arg Arg Val Leu Leu Trp Leu Val

145 150 155 160


Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys Glu

165 170 175




Tyr Ile Ala Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala Leu

180 185 190

Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe Pro

195 200 205

Thr Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn Glu

210 215 220

Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu Tyr
225 230 235 240

His Ile Lys His Asn Ile Leu Tyr Phe Leu Thr Tyr Ala Asp Glu Tyr

245 250 255

Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Asp Ile Lys Val

260 265 270

Pro Lys Ser Arg Tyr Leu Gly Tyr Ile Lys Asp Tyr Glu Gly Ala Thr

275 280 285

Leu Met Glu Cys Glu Leu Asn Pro Arg Ile Pro Tyr Thr Glu Leu Ser

290 295 300

His Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu Arg
305 310 315 320

Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe Lys

325 330 335

Glu Gly Val Arg Gln Ile Pro Val Glu Ser Val Pro Gly Ile Arg Glu

340 345 350

Thr Gly Trp Lys Pro Leu Gly Lys Glu Lys Gly Lys Glu Leu Lys Asp

355 360 365

Pro Asp Gln Leu Tyr Thr Thr Leu Lys Asn Leu Leu Ala Gln Ile Lys

370 375 380

Ser His Pro Ser Ala Trp Pro Phe Met Glu Pro Val Lys Lys Ser Glu
385 390 395 400

Ala Pro Asp Tyr Tyr Glu Val Ile Arg Phe Pro Ile Asp Leu Lys Thr

405 410 415

Met Thr Glu Arg Leu Arg Ser Arg Tyr Tyr Val Thr Arg Lys Leu Phe

420 425 430

Val Ala Asp Leu Gln Arg Val Ile Ala Asn Cys Arg Glu Tyr Asn Pro

435 440 445

Pro Asp Ser Glu Tyr Cys Arg Cys Ala Ser Ala Leu Glu Lys Phe Phe

450 455 460

Tyr Phe Lys Leu Lys Glu Gly Gly Leu Ile Asp Lys
465 470 475 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2414 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ala Glu Asn Val Val Glu Pro Gly Pro Pro Ser Ala Lys Arg Pro 

1 5 10 15

Lys Leu Ser Ser Pro Ala Leu Ser Ala Ser Ala Ser Asp Gly Thr Asp

20 25 30

Phe Gly Ser Leu Phe Asp Leu Glu His Asp Leu Pro Asp Glu Leu Ile

35 40 45

Asn Ser Thr Glu Leu Gly Leu Thr Asn Gly Gly Asp Ile Asn Gln Leu

50 55 60

Gln Thr Ser Leu Gly Met Val Gln Asp Ala Ala Ser Lys His Lys Gln
65 70 75 80

Leu Ser Glu Leu Leu Arg Ser Gly Ser Ser Pro Asn Leu Asn Met Gly

85 90 95

Val Gly Gly Pro Gly Gln Val Met Ala Ser Gln Ala Gln Gln Ser Ser
100 105 110

Pro


Gly 


Leu 
x x o 


Gly 


Leu 


He 


Asn 


Ser 

X 4L. U 


Ala 


Gly 
IjU 


Leu 


Thr 


Ser 


Pro 


Asn 


Met 


Gin 


Gly 


Pro 


Thr 


Gin 


Ser 


Thr 


Gly 










x o u 






Pro 


ax a 


rie l. 


oxy 


ne l. 

IDJ 


Asn 


1 IlX 


OX y 


Met 


Leu 


Ala 


Ala 
lou 


Gly 


Asn 


Gly 


Gin 


Asn 


Gly 


Ser 
i y o 


lie 


Gly 


Ala 


Gly 


Arg 

^UU 


Asn 


Pro 
210 


Gly 


Met 


Gly 


Ser 


Ala 


Gly 


Gin 


Gly 


Ser 


Pro 


Gin 


Met 


Gly 


Gly 


225 
















Pro 


Leu 


Lys 


Met 


Gly 

o ^ 


Met 


Met 


Asn 


Tyr 


Thr 


Gin 


Asn 

o Gs n 


Pro 


Gly 


Gin 


Gin 


Gin 


He 


Gin 

O "7 ^ 


Thr 


Lys 


Thr 


val 


Leu 

O fi fi 
Z u U 


Met 


Asp 

O Q A 

zyu 


Lys 


Lys 


Ala 


Val 


Pro 


Gly 


Gin 


Pro 


Ala 


Pro 


Gin 


Val 


Gin 


Gin 


305 
















oxn 


C 1 XT 

ox y 




o-x y 


OCX. 


ui y 


TV 1 a 




Leu 


He 


Gin 


Gin 


Gin 


Leu 


Val 


Leu 


Arg 


Arg 


Glu 

*} ^ ^ 


Gin 


Ala 


Asn 


Gly 


Glu 


Cys 


Arg 

J f u 


Thr 


Met 


Lys 


Asn 


Val 

,3 / O 


Leu 


Gly 


Lys 


Ser 


Cys 


Gin 


Val 


Ala 


His 


385 










o o n 

o y u 






Ser 


His 


Trp 


Lys 


Asn 
405 


Cys 


i nr 


Arg 


Leu 


Lys 


Asn 


Ala 

42 0 


Gly 


Asp 


Lys 


Arg 


Ala 


Pro 


Val 
4 35 


Gly 


Leu 


Gly 


Asn 


Pro 
440 


Ser 


Ala 
450 


Pro 


Asn 


Leu 


Ser 


Thr 

yl c. 


Val 


Glu 


Arg 


Ala 


Tyr 


Ala 


Ala 


Leu 


Gly 


4 65 










a ^ n 

H / U 






Pro 


Thr 


Gin 


Pro 


Gin 
4 85 


val 


Gin 


TV T — . 

Ala 


Gly 


Gin 


Ser 


Pro 

jUU 


Gin 


Gly 


Met 


Arg 


Pro 


Met 


Gly 

O 1 D 


val 


Asn 


Gly 


Gly 


Val 

Ron 
oz u 


Ser 


Asp 
530 


Ser 


Met 


Leu 


His 


Ser 

C tJ c 


Ala 


Ser 


Glu 


Asn 


Ala 


Ser 


Val 


Pro 


Ser 


545 










550 






Gin 


Pro 


Ser 


Thr 


Thr 
565 


Gly 


lie 


Arg 


Gin 


Asp 


Leu 


Arg 
580 


Asn 


His 


Leu 


Val 


Pro 


Thr 


Pro 
595 


Asp 


Pro 


Ala 


Ala 


Leu 
600 






171 tS U 


val 


Lys 


OCX 


riu 


IvTpf 


Thr 


Gin 










12 5 








Gl y 


Met 


Gly 


Thr 


Ser 


Gl v 


Pro 


Asn 








140 














Asn 


Ser 


Pro 


Val 


Asn 


Gin 
















160 


Thr 


As n 


Ala 


Gly 


Met 


Asn 


Pro 


Gly 




X / \s 










X / -J 




OX _y 


He 


Met 


Pro 


Asn 


Gin 


Val 


Met 


X O J 










19 0 






o x y 


./ax y 


Gin 


Asp 


Met 


ox ii 


i yx 


Pro 










9 n ^ 

ZU J 








Asn 


Xieu 


Leu 


i nr 


oXU 


Pro 


Leu 


bin 








9 n 










OX 11 


1 11 X 


ox y 


Leu 


Arg 


ox y 


13 r~rv 
ir X vJ 


OX 1 1 






ZvDJ 










Zl fl 
Z1U 


Asn 


Pro 


Asn 


Pro 


Tyr 


oX y 


Ser 


Pro 




ZjU 










9 s s 

Z J J 




He 


Gly 


Ala 


Ser 


Gly 


Leu 


Glv 


Leu 


2 65 










27 0 






Ser 


A.sn 


As n 


Leu 


Ser 


Pro 


Phe 


Ala 










285 








ox y 


ox _y 


Mp f 
1 1C c 


Pro 


As n 


Met 


Gly 


Gin 








o w v 










Pro 


vjxy 


Leu 


V CLX 


X 11 X 




val 


rVX Ct 






O X J 










7 Cl 


JL 11 X 


riXa 


Asp 


Pro 

t X u 


OX LI 


Lys 


Arg 


Lys 














i. ^ s 




Leu 


Leu 


His 


Ala 


His 


Lys 


Cys 


Gin 


^ zi *s 










35 0 






V^ 1 

V Cl X 


/"YX. y 


Gin 


Cys 


Asn 


Leu 


Pro 


Hi s 










3 65 








Asn 


XIX o 




± Hi. 


His 


Cys 


Gin 


Ser 








380 










cys 


J~V,X cl 


OC 1 


Ser 


r\S- wj 


Gin 


He 


He 
















zi nri 


HXS 


Asp 


Cys 


Pro 


v ax 


Cys 


Leu 


Pro 




a *i n 
4XU 










H ID 




Asn 


Gin 


oxn 


Pro 


xxe 


Leu 


mu •»- 

l nr 


bi y 


/IOC, 










a n 






Ser 


Ser 


Leu 


Gly 


vax 


ox y 


bin 


bin 










ZI J ^ 
fl fi D 








Ser 


bin 


xxe 


Asp 


Pro 


Ser 


Ser 


Tic 

xxe 








a n 










Leu 


Pro 


Tyr 


Lrlll 


V dl 


As n 


OX 11 


Met- 
ric U 
















a n 
sou 


Lys 


As n 


oxn 


oxn 


Asn 


oXIl 


OX 11 


Pro 


^ on 










Zl Q R 




Jtr X \J 




Ser 


As n 


Met 


Ser 


Ala 


Ser 


JU J 
















ox y 


val 


OX 11 


1 11 X 


riu 


OCX 


Leu 


Leu 










j> -j 








He 


Asn 


Ser 


Gin 


Asn 


Pro 


Met 


Met 








540 










Leu 


Gly 


Pro 


Met 


Pro 


Thr 


Ala 


Ala 






555 










560 


Lys 


Gin 


Trp 


His 


Glu 


Asp 


He 


Thr 




570 










575 




His 


Lys 


Leu 


Val 


Gin 


Ala 


He 


Phe 


585 










590 






Lys 


Asp 


Arg 


Arg 


Met 


Glu 


Asn 


Leu 



605

Val Ala Tyr Ala Arg Lys Val Glu Gly Asp Met Tyr Glu Ser Ala Asn

610 615 620

Asn Arg Ala Glu Tyr Tyr His Leu Leu Ala Glu Lys Ile Tyr Lys Ile
625 630 635 640

Gln Lys Glu Leu Glu Glu Lys Arg Arg Thr Arg Leu Gln Lys Gln Asn

645 650 655

Met Leu Pro Asn Ala Ala Gly Met Val Pro Val Ser Met Asn Pro Gly

660 665 670

Pro Asn Met Gly Gln Pro Gln Pro Gly Met Thr Ser Asn Gly Pro Leu

675 680 685

Pro Asp Pro Ser Met Ile Arg Gly Ser Val Pro Asn Gln Met Met Pro

690 695 700

Arg Ile Thr Pro Gln Ser Gly Leu Asn Gln Phe Gly Gln Met Ser Met
705 710 715 720

Ala Gln Pro Pro Ile Val Pro Arg Gln Thr Pro Pro Leu Gln His His

725 730 735

Gly Gln Leu Ala Gln Pro Gly Ala Leu Asn Pro Pro Met Gly Tyr Gly

740 745 750

Pro Arg Met Gln Gln Pro Ser Asn Gln Gly Gln Phe Leu Pro Gln Thr

755 760 765

Gln Phe Pro Ser Gln Gly Met Asn Val Thr Asn Ile Pro Leu Ala Pro

770 775 780

Ser Ser Gly Gln Ala Pro Val Ser Gln Ala Gln Met Ser Ser Ser Ser
785 790 795 800

Cys Pro Val Asn Ser Pro Ile Met Pro Pro Gly Ser Gln Gly Ser His

805 810 815

Ile His Cys Pro Gln Leu Pro Gln Pro Ala Leu His Gln Asn Ser Pro

820 825 830

Ser Pro Val Pro Ser Arg Thr Pro Thr Pro His His Thr Pro Pro Ser

835 840 845

Ile Gly Ala Gln Gln Pro Pro Ala Thr Thr Ile Pro Ala Pro Val Pro

850 855 860

Thr Pro Pro Ala Met Pro Pro Gly Pro Gln Ser Gln Ala Leu His Pro
865 870 875 880

Pro Pro Arg Gln Thr Pro Thr Pro Pro Thr Thr Gln Leu Pro Gln Gln

885 890 895

Val Gln Pro Ser Leu Pro Ala Ala Pro Ser Ala Asp Gln Pro Gln Gln

900 905 910

Gln Pro Arg Ser Gln Gln Ser Thr Ala Ala Ser Val Pro Thr Pro Asn

915 920 925

Ala Pro Leu Leu Pro Pro Gln Pro Ala Thr Pro Leu Ser Gln Pro Ala

930 935 940

Val Ser Ile Glu Gly Gln Val Ser Asn Pro Pro Ser Thr Ser Ser Thr
945 950 955 960

Glu Val Asn Ser Gln Ala Ile Ala Glu Lys Gln Pro Ser Gln Glu Val

965 970 975

Lys Met Glu Ala Lys Met Glu Val Asp Gln Pro Glu Pro Ala Asp Thr

980 985 990

Gln Pro Glu Asp Ile Ser Glu Ser Lys Val Glu Asp Cys Lys Met Glu

995 1000 1005

Ser Thr Glu Thr Glu Glu Arg Ser Thr Glu Leu Lys Thr Glu Ile Lys

1010 1015 1020

Glu Glu Glu Asp Gln Pro Ser Thr Ser Ala Thr Gln Ser Ser Pro Ala
1025 1030 1035 1040

Pro Gly Gln Ser Lys Lys Lys Ile Phe Lys Pro Glu Glu Leu Arg Gln

1045 1050 1055

Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg Gln Asp Pro Glu Ser

1060 1065 1070

Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu Leu Gly Ile Pro Asp

1075 1080 1085

Tyr Phe Asp Ile Val Lys Ser Pro Met Asp Leu Ser Thr Ile Lys Arg
1090 1095 1100

Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp Gln Tyr Val Asp Asp
1105 1110 1115 1120

Ile Trp Leu Met Phe Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser

1125 1130 1135

Arg Val Tyr Lys Tyr Cys Ser Lys Leu Ser Glu Val Phe Glu Gln Glu

1140 1145 1150

Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys Cys Gly Arg Lys Leu

1155 1160 1165

Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly Lys Gln Leu Cys Thr

1170 1175 1180

Ile Pro Arg Asp Ala Thr Tyr Tyr Ser Tyr Gln Asn Arg Tyr His Phe
1185 1190 1195 1200

Cys Glu Lys Cys Phe Asn Glu Ile Gln Gly Glu Ser Val Ser Leu Gly

1205 1210 1215

Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Asn Lys Glu Gln Phe Ser

1220 1225 1230

Lys Arg Lys Asn Asp Thr Leu Asp Pro Glu Leu Phe Val Glu Cys Thr

1235 1240 1245

Glu Cys Gly Arg Lys Met His Gln Ile Cys Val Leu His His Glu Ile

1250 1255 1260

Ile Trp Pro Ala Gly Phe Val Cys Asp Gly Cys Leu Lys Lys Ser Ala
1265 1270 1275 1280

Arg Thr Arg Lys Glu Asn Lys Phe Ser Ala Lys Arg Leu Pro Ser Thr

1285 1290 1295

Arg Leu Gly Thr Phe Leu Glu Asn Arg Val Asn Asp Phe Leu Arg Arg

1300 1305 1310

Gln Asn His Pro Glu Ser Gly Glu Val Thr Val Arg Val Val His Ala

1315 1320 1325

Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met Lys Ala Arg Phe Val

1330 1335 1340

Asp Ser Gly Glu Met Ala Glu Ser Phe Pro Tyr Arg Thr Lys Ala Leu
1345 1350 1355 1360

Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Leu Cys Phe Phe Gly Met

1365 1370 1375

His Val Gln Glu Tyr Gly Ser Asp Cys Pro Pro Pro Asn Gln Arg Arg

1380 1385 1390

Val Tyr Ile Ser Tyr Leu Asp Ser Val His Phe Phe Arg Pro Lys Cys

1395 1400 1405

Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile Gly Tyr Leu Glu Tyr

1410 1415 1420

Val Lys Lys Leu Gly Tyr Thr Thr Gly His Ile Trp Ala Cys Pro Pro
1425 1430 1435 1440

Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His Pro Pro Asp Gln Lys

1445 1450 1455

Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr Lys Lys Met Leu Asp

1460 1465 1470

Lys Ala Val Ser Glu Arg Ile Val His Asp Tyr Lys Asp Ile Phe Lys

1475 1480 1485

Gln Ala Thr Glu Asp Arg Leu Thr Ser Ala Lys Glu Leu Pro Tyr Phe

1490 1495 1500

Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu Ser Ile Lys Glu Leu
1505 1510 1515 1520

Glu Gln Glu Glu Glu Glu Arg Lys Arg Glu Glu Asn Thr Ser Asn Glu

1525 1530 1535

Ser Thr Asp Val Thr Lys Gly Asp Ser Lys Asn Ala Lys Lys Lys Asn

1540 1545 1550

Asn Lys Lys Thr Ser Lys Asn Lys Ser Ser Leu Ser Arg Gly Asn Lys

1555 1560 1565

Lys Lys Pro Gly Met Pro Asn Val Ser Asn Asp Leu Ser Gln Lys Leu

1570 1575 1580

Tyr Ala Thr Met Glu Lys His Lys Glu Val Phe Phe Val Ile Arg Leu
1585 1590 1595 1600

Ile Ala Gly Pro Ala Ala Asn Ser Leu Pro Pro Ile Val Asp Pro Asp

1605 1610 1615

Pro Leu Ile Pro Cys Asp Leu Met Asp Gly Arg Asp Ala Phe Leu Thr

1620 1625 1630

Leu Ala Arg Asp Lys His Leu Glu Phe Ser Ser Leu Arg Arg Ala Gln

1635 1640 1645

Trp Ser Thr Met Cys Met Leu Val Glu Leu His Thr Gln Ser Gln Asp

1650 1655 1660

Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys His His Val Glu Thr Arg
1665 1670 1675 1680

Trp His Cys Thr Val Cys Glu Asp Tyr Asp Leu Cys Ile Thr Cys Tyr

1685 1690 1695

Asn Thr Lys Asn His Asp His Lys Met Glu Lys Leu Gly Leu Gly Leu

1700 1705 1710

Asp Asp Glu Ser Asn Asn Gln Gln Ala Ala Ala Thr Gln Ser Pro Gly

1715 1720 1725

Asp Ser Arg Arg Leu Ser Ile Gln Arg Cys Ile Gln Ser Leu Val His

1730 1735 1740

Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser Leu Pro Ser Cys Gln Lys
1745 1750 1755 1760

Met Lys Arg Val Val Gln His Thr Lys Gly Cys Lys Arg Lys Thr Asn

1765 1770 1775

Gly Gly Cys Pro Ile Cys Lys Gln Leu Ile Ala Leu Cys Cys Tyr His

1780 1785 1790

Ala Lys His Cys Gln Glu Asn Lys Cys Pro Val Pro Phe Cys Leu Asn

1795 1800 1805

Ile Lys Gln Lys Leu Arg Gln Gln Gln Leu Gln His Arg Leu Gln Gln

1810 1815 1820

Ala Gln Met Leu Arg Arg Arg Met Ala Ser Met Gln Arg Thr Gly Val
1825 1830 1835 1840

Val Gly Gln Gln Gln Gly Leu Pro Ser Pro Thr Pro Ala Thr Pro Thr

1845 1850 1855

Thr Pro Thr Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr

1860 1865 1870

Ser Gln Pro Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro

1875 1880 1885

Arg Thr Gln Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln

1890 1895 1900

Val Thr Pro Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly
1905 1910 1915 1920

Pro Pro Pro Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala

1925 1930 1935

Glu Thr Gln Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile

1940 1945 1950

Gln His Gln Met Pro Pro Met Thr Pro Met Ala Pro Met Gly Met Asn

1955 1960 1965

Pro Pro Pro Met Thr Arg Gly Pro Ser Gly His Leu Glu Pro Gly Met

1970 1975 1980

Gly Pro Thr Gly Met Gln Gln Gln Pro Pro Trp Ser Gln Gly Gly Leu
1985 1990 1995 2000

Pro Gln Pro Gln Gln Leu Gln Ser Gly Met Pro Arg Pro Ala Met Met

2005 2010 2015

Ser Val Ala Gln His Gly Gln Pro Leu Asn Met Ala Pro Gln Pro Gly

2020 2025 2030

Leu Gly Gln Val Gly Ile Ser Pro Leu Lys Pro Gly Thr Val Ser Gln

2035 2040 2045

Gln Ala Leu Gln Asn Leu Leu Arg Thr Leu Arg Ser Pro Ser Ser Pro

2050 2055 2060

Leu Gln Gln Gln Gln Val Leu Ser Ile Leu His Ala Asn Pro Gln Leu
2065 2070 2075 2080

Leu Ala Ala Phe Ile Lys Gln Arg Ala Ala Lys Tyr Ala Asn Ser Asn
2085 2090 2095

Pro Gln Pro Ile Pro Gly Gln Pro Gly Met Pro Gln Gly Gln Pro Gly

2100 2105 2110

Leu Gln Pro Pro Thr Met Pro Gly Gln Gln Gly Val His Ser Asn Pro

2115 2120 2125

Ala Met Gln Asn Met Asn Pro Met Gln Ala Gly Val Gln Arg Ala Gly

2130 2135 2140

Leu Pro Gln Gln Gln Pro Gln Gln Gln Leu Gln Pro Pro Met Gly Gly
2145 2150 2155 2160

Met Ser Pro Gln Ala Gln Gln Met Asn Met Asn His Asn Thr Met Pro

2165 2170 2175

Ser Gln Phe Arg Asp Ile Leu Arg Arg Gln Gln Met Met Gln Gln Gln

2180 2185 2190

Gln Gln Gln Gly Ala Gly Pro Gly Ile Gly Pro Gly Met Ala Asn His

2195 2200 2205

Asn Gln Phe Gln Gln Pro Gln Gly Val Gly Tyr Pro Pro Gln Pro Gln

2210 2215 2220

Gln Arg Met Gln His His Met Gln Gln Met Gln Gln Gly Asn Met Gly
2225 2230 2235 2240

Gln Ile Gly Gln Leu Pro Gln Ala Leu Gly Ala Glu Ala Gly Ala Ser

2245 2250 2255

Leu Gln Ala Tyr Gln Gln Arg Leu Leu Gln Gln Gln Met Gly Ser Pro

2260 2265 2270

Val Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu Pro Asn Gln

2275 2280 2285

Ala Gln Ser Pro His Leu Gln Gly Gln Gln Ile Pro Asn Ser Leu Ser

2290 2295 2300

Asn Gln Val Arg Ser Pro Gln Pro Val Pro Ser Pro Arg Pro Gln Ser
2305 2310 2315 2320

Gln Pro Pro His Ser Ser Pro Ser Pro Arg Met Gln Pro Gln Pro Ser

2325 2330 2335

Pro His His Val Ser Pro Gln Thr Ser Ser Pro His Pro Gly Leu Val

2340 2345 2350

Ala Ala Gln Ala Asn Pro Met Glu Gln Gly His Phe Ala Ser Pro Asp

2355 2360 2365

Gln Asn Ser Met Leu Ser Gln Leu Ala Ser Asn Pro Gly Met Ala Asn

2370 2375 2380

Leu His Gly Ala Ser Ala Thr Asp Leu Gly Leu Ser Thr Asp Asn Ser
2385 2390 2395 2400

Asp Leu Asn Ser Asn Leu Ser Gln Ser Thr Leu Asp Ile His
2405 2410 2 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2441 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ala Glu Asn Leu Leu Asp Gly Pro Pro Asn Pro Lys Arg Ala Lys 

1 5 10 15

Leu Ser Ser Pro Gly Phe Ser Ala Asn Asp Asn Thr Asp Phe Gly Ser

20 25 30

Leu Phe Asp Leu Glu Asn Asp Leu Pro Asp Glu Leu Ile Pro Asn Gly

35 40 45

Glu Leu Ser Leu Leu Asn Ser Gly Asn Leu Val Pro Asp Ala Ala Ser

50 55 60

Lys His Lys Gln Leu Ser Glu Leu Leu Arg Gly Gly Ser Gly Ser Ser
65 70 75 80

Ile Asn Pro Gly Ile Gly Asn Val Ser Ala Ser Ser Pro Val Gln Gln

85 90 95

Gly Leu Gly Gly Gln Ala Gln Gly Gln Pro Asn Ser Thr Asn Met Ala

100 105 110

Ser Leu Gly Ala Met Gly Lys Ser Pro Leu Asn Gln Gly Asp Ser Ser

115 120 125

Thr Pro Asn Leu Pro Lys Gln Ala Ala Ser Thr Ser Gly Pro Thr Pro

130 135 140

Pro Ala Ser Gln Ala Leu Asn Pro Gln Ala Gln Lys Gln Val Gly Leu
145 150 155 160

Val Thr Ser Ser Pro Ala Thr Ser Gln Thr Gly Pro Gly Ile Cys Met

165 170 175

Asn Ala Asn Phe Asn Gln Thr His Pro Gly Leu Leu Asn Ser Asn Ser

180 185 190

Gly His Ser Leu Met Asn Gln Ala Gln Gln Gly Gln Ala Gln Val Met

195 200 205

Asn Gly Ser Leu Gly Ala Ala Gly Arg Gly Arg Gly Ala Gly Met Pro

210 215 220

Tyr Pro Ala Pro Ala Met Gln Gly Ala Thr Ser Ser Val Leu Ala Glu
225 230 235 240

Thr Leu Thr Gln Val Ser Pro Gln Met Ala Gly His Ala Gly Leu Asn

245 250 255

Thr Ala Gln Ala Gly Gly Met Thr Lys Met Gly Met Thr Gly Thr Thr

260 265 270

Ser Pro Phe Gly Gln Pro Phe Ser Gln Thr Gly Gly Gln Gln Met Gly

275 280 285

Ala Thr Gly Val Asn Pro Gln Leu Ala Ser Lys Gln Ser Met Val Asn

290 295 300

Ser Leu Pro Ala Phe Pro Thr Asp Ile Lys Asn Thr Ser Val Thr Thr
305 310 315 320

Val Pro Asn Met Ser Gln Leu Gln Thr Ser Val Gly Ile Val Pro Thr

325 330 335

Gln Ala Ile Ala Thr Gly Pro Thr Ala Asp Pro Glu Lys Arg Lys Leu

340 345 350

Ile Gln Gln Gln Leu Val Leu Leu Leu His Ala His Lys Cys Gln Arg

355 360 365

Arg Glu Gln Ala Asn Gly Glu Val Arg Ala Cys Ser Leu Pro His Cys

370 375 380

Arg Thr Met Lys Asn Val Leu Asn His Met Thr His Cys Gln Ala Pro
385 390 395 400

Lys Ala Cys Gln Val Ala His Cys Ala Ser Ser Arg Gln Ile Ile Ser

405 410 415

His Trp Lys Asn Cys Thr Arg His Asp Cys Pro Val Cys Leu Pro Leu

420 425 430

Lys Asn Ala Ser Asp Lys Arg Asn Gln Gln Thr Ile Leu Gly Ser Pro

435 440 445

Ala Ser Gly Ile Gln Asn Thr Ile Gly Ser Val Gly Ala Gly Gln Gln

450 455 460

Asn Ala Thr Ser Leu Ser Asn Pro Asn Pro Ile Asp Pro Ser Ser Met
465 470 475 480

Gln Arg Ala Tyr Ala Ala Leu Gly Leu Pro Tyr Met Asn Gln Pro Gln

485 490 495

Thr Gln Leu Gln Pro Gln Val Pro Gly Gln Gln Pro Ala Gln Pro Pro

500 505 510

Ala His Gln Gln Met Arg Thr Leu Asn Ala Leu Gly Asn Asn Pro Met

515 520 525

Ser Val Pro Ala Gly Gly Ile Thr Thr Asp Gln Gln Pro Pro Asn Leu

530 535 540

Ile Ser Glu Ser Ala Leu Pro Thr Ser Leu Gly Ala Thr Asn Pro Leu
545 550 555 560

Met Asn Asp Gly Ser Asn Ser Gly Asn Ile Gly Ser Leu Ser Thr Ile
565 570 575

Ser Gln Ser Thr Ser Pro Ser Gln Pro Arg Lys Lys Ile Phe Lys Pro

1075 1080 1085

Glu Glu Leu Arg Gln Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg

1090 1095 1100

Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu
1105 1110 1115 1120

Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val Lys Asn Pro Met Asp Leu

1125 1130 1135

Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp

1140 1145 1150

Gln Tyr Val Asp Asp Val Arg Leu Met Phe Asn Asn Ala Trp Leu Tyr

1155 1160 1165

Asn Arg Lys Thr Ser Arg Val Tyr Lys Phe Cys Ser Lys Leu Ala Glu

1170 1175 1180

Val Phe Glu Gln Glu Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys
1185 1190 1195 1200

Cys Gly Arg Lys Tyr Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly

1205 1210 1215

Lys Gln Leu Cys Thr Ile Pro Arg Asp Ala Ala Tyr Tyr Ser Tyr Gln

1220 1225 1230

Asn Arg Tyr His Phe Cys Gly Lys Cys Phe Thr Glu Ile Gln Gly Glu

1235 1240 1245

Asn Val Thr Leu Gly Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Ser

1250 1255 1260

Lys Asp Gln Phe Glu Lys Lys Lys Asn Asp Thr Leu Asp Pro Glu Pro
1265 1270 1275 1280

Phe Val Asp Cys Lys Glu Cys Gly Arg Lys Met His Gln Ile Cys Val

1285 1290 1295

Leu His Tyr Asp Ile Ile Trp Pro Ser Gly Phe Val Cys Asp Asn Cys

1300 1305 1310

Leu Lys Lys Thr Gly Arg Pro Arg Lys Glu Asn Lys Phe Ser Ala Lys

1315 1320 1325

Arg Leu Gln Thr Thr Arg Leu Gly Asn His Leu Glu Asp Arg Val Asn

1330 1335 1340

Lys Phe Leu Arg Arg Gln Asn His Pro Glu Ala Gly Glu Val Phe Val
1345 1350 1355 1360

Arg Val Val Ala Ser Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met

1365 1370 1375

Lys Ser Arg Phe Val Asp Ser Gly Glu Met Ser Glu Ser Phe Pro Tyr

1380 1385 1390

Arg Thr Lys Ala Leu Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Val

1395 1400 1405

Cys Phe Phe Gly Met His Val Gln Asp Thr Ala Leu Ile Ala Pro His

1410 1415 1420

Gln Ile Gln Gly Cys Val Tyr Ile Ser Tyr Leu Asp Ser Ile His Phe
1425 1430 1435 1440

Phe Arg Pro Arg Cys Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile

1445 1450 1455

Gly Tyr Leu Glu Tyr Val Lys Lys Leu Val Tyr Val Thr Ala His Ile

1460 1465 1470

Trp Ala Cys Pro Pro Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His

1475 1480 1485

Pro Pro Asp Gln Lys Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr

1490 1495 1500

Lys Lys Met Leu Asp Lys Ala Phe Ala Glu Arg Ile Ile Asn Asp Tyr
1505 1510 1515 1520

Lys Asp Ile Phe Lys Gln Ala Asn Glu Asp Arg Leu Thr Ser Ala Lys

1525 1530 1535

Glu Leu Pro Tyr Phe Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu

1540 1545 1550

Ser Ile Lys Glu Leu Glu Gln Glu Glu Glu Glu Arg Lys Lys Glu Glu
1555 1560 1565

Ser Thr Ala Ala Ser Glu Thr Pro Glu Gly Ser Gln Gly Asp Ser Lys

1570 1575 1580

Asn Ala Lys Lys Lys Asn Asn Lys Lys Thr Asn Lys Asn Lys Ser Ser
1585 1590 1595 1600

Ile Ser Arg Ala Asn Lys Lys Lys Pro Ser Met Pro Asn Val Ser Asn

1605 1610 1615

Asp Leu Ser Gln Lys Leu Tyr Ala Thr Met Glu Lys His Lys Glu Val

1620 1625 1630

Phe Phe Val Ile His Leu His Ala Gly Pro Val Ile Ser Thr Gln Pro

1635 1640 1645

Pro Ile Val Asp Pro Asp Pro Leu Leu Ser Cys Asp Leu Met Asp Gly

1650 1655 1660

Arg Asp Ala Phe Leu Thr Leu Ala Arg Asp Lys His Trp Glu Phe Ser
1665 1670 1675 1680

Ser Leu Arg Arg Ser Lys Trp Ser Thr Leu Cys Met Leu Val Glu Leu

1685 1690 1695

His Thr Gln Gly Gln Asp Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys

1700 1705 1710

His His Val Glu Thr Arg Trp His Cys Thr Val Cys Glu Asp Tyr Asp

1715 1720 1725

Leu Cys Ile Asn Cys Tyr Asn Thr Lys Ser His Thr His Lys Met Val

1730 1735 1740

Lys Trp Gly Leu Gly Leu Asp Asp Glu Gly Ser Ser Gln Gly Glu Pro
1745 1750 1755 1760

Gln Ser Lys Ser Pro Gln Glu Ser Arg Arg Leu Ser Ile Gln Arg Cys

1765 1770 1775

Ile Gln Ser Leu Val His Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser

1780 1785 1790

Leu Pro Ser Cys Gln Lys Met Lys Arg Val Val Gln His Thr Lys Gly

1795 1800 1805

Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys Lys Gln Leu Ile

1810 1815 1820

Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu Asn Lys Cys Pro
1825 1830 1835 1840

Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg Gln Gln Gln Ile

1845 1850 1855

Gln His Cys Leu Gln Gln Ala Gln Leu Met Arg Arg Arg Met Ala Thr

1860 1865 1870

Met Asn Thr Arg Asn Val Pro Gln Gln Ser Leu Pro Ser Pro Thr Ser

1875 1880 1885

Ala Pro Pro Gly Thr Pro Thr Gln Gln Pro Ser Thr Pro Gln Thr Pro

1890 1895 1900

Gln Pro Pro Ala Gln Pro Gln Pro Ser Pro Val Asn Met Ser Pro Ala
1905 1910 1915 1920

Gly Phe Pro Asn Val Ala Arg Thr Gln Pro Pro Thr Ile Val Ser Ala

1925 1930 1935

Gly Lys Pro Thr Asn Gln Val Pro Ala Pro Pro Pro Pro Ala Gln Pro

1940 1945 1950

Pro Pro Ala Ala Val Glu Ala Ala Arg Gln Ile Glu Arg Glu Ala Gln

1955 1960 1965

Gln Gln Gln His Leu Tyr Arg Ala Asn Ile Asn Asn Gly Met Pro Pro

1970 1975 1980

Gly Arg Asp Gly Met Gly Thr Pro Gly Ser Gln Met Thr Pro Val Gly
1985 1990 1995 2000

Leu Asn Val Pro Arg Pro Asn Gln Val Ser Gly Pro Val Met Ser Ser

2005 2010 2015

Met Pro Pro Gly Gln Trp Gln Gln Ala Pro Ile Pro Gln Gln Gln Pro

2020 2025 2030

Met Pro Gly Met Pro Arg Pro Val Met Ser Met Gln Ala Gln Ala Ala

2035 2040 2045

Val Ala Gly Pro Arg Met Pro Asn Val Gln Pro Asn Arg Ser Ile Ser
2050 2055 2060 
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Ser 



Pro Ser Ala Leu Gin Asp Leu Leu Arg Thr Leu Lys Ser Pro Ser S« 
065 2070 2075 2080 

Pro Gln Gln Gln Gln Gln Val Leu Asn Ile Leu Lys Ser Asn Pro Gln

2085 2090 2095

Leu Met Ala Ala Phe Ile Lys Gln Arg Thr Ala Lys Tyr Val Ala Asn

2100 2105 2110

Gln Pro Gly Met Gln Pro Gln Pro Gly Leu Gln Ser Gln Pro Gly Met

2115 2120 2125

Gln Pro Gln Pro Gly Met His Gln Gln Pro Ser Leu Gln Asn Leu Asn

2130 2135 2140

Ala Met Gln Ala Gly Val Pro Arg Pro Gly Val Pro Pro Pro Gln Pro
2145 2150 2155 2160

Ala Met Gly Gly Leu Asn Pro Gln Gly Gln Ala Leu Asn Ile Met Asn

2165 2170 2175

Pro Gly His Asn Pro Asn Met Thr Asn Met Asn Pro Gln Tyr Arg Glu

2180 2185 2190

Met Val Arg Arg Gln Leu Leu Gln His Gln Gln Gln Gln Gln Gln Gln

2195 2200 2205

Gln Gln Gln Gln Gln Gln Gln Gln Asn Ser Ala Ser Leu Ala Gly Gly

2210 2215 2220

Met Ala Gly His Ser Gln Phe Gln Gln Pro Gln Gly Pro Gly Gly Tyr
2225 2230 2235 2240

Ala Pro Ala Met Gln Gln Gln Arg Met Gln Gln His Leu Pro Ile Gln

2245 2250 2255

Gly Ser Ser Met Gly Gln Met Ala Ala Pro Met Gly Gln Leu Gly Gln

2260 2265 2270

Met Gly Gln Pro Gly Leu Gly Ala Asp Ser Thr Pro Asn Ile Gln Gln

2275 2280 2285

Ala Leu Gln Gln Arg Ile Leu Gln Gln Gln Gln Met Lys Gln Gln Ile

2290 2295 2300

Gly Ser Pro Gly Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu
2305 2310 2315 2320

Ser Gly Gln Pro Gln Ala Ser His Leu Pro Gly Gln Gln Ile Ala Thr

2325 2330 2335

Ser Leu Ser Asn Gln Val Arg Ser Pro Ala Pro Val Gln Ser Pro Arg

2340 2345 2350

Pro Gln Ser Gln Pro Pro His Ser Ser Pro Ser Pro Arg Ile Gln Pro

2355 2360 2365

Gln Pro Ser Pro His His Val Ser Pro Gln Thr Gly Thr Pro His Pro

2370 2375 2380

Gly Leu Ala Val Thr Met Ala Ser Ser Met Asp Gln Gly His Leu Gly
2385 2390 2395 2400

Asn Pro Glu Gln Ser Ala Met Leu Pro Gln Leu Asn Thr Pro Asn Arg

2405 2410 2415

Ser Ala Leu Ser Ser Glu Leu Ser Leu Val Gly Asp Thr Thr Gly Asp

2420 2425 2430

Thr Leu Glu Lys Phe Val Glu Gly Leu
2435 2440

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 813 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ala Glu Ala Gly Gly Ala Gly Ser Pro Ala Leu Pro Pro Ala Pro
1 5 10 15
>i_L a


Arg 


x xe 


>11 d 


V dl 


Lys 


Lys 


Al 3 
.rti d 


ox 1 1 


Leu 




DO 










3 3 










ft n 
0 u 










Arg 


oer 


J-\J- a 


Pro 


Arg 


a l ^ 


Lys 


Lys 


Leu 


Glu 


Lys 


Leu 


Gly 


Val 


fur 
x y x 


Ser 


DO 










n n 
/ u 




















8 0 


Ala 


Cys 


Lys 


J\JL a 




p 1 n 
LrX u 




Cys 


i*y s 


Cys 


As n 


Gly 


X XLJ 


Lys 


As n 


Pro 








ft R 
a 3 
























Asn 


Pro 


Ser 


Pro 


i nr 


Pro 


Pro 


Arg 


OX y 


A.sp 


lteu 


V7XI1 


Gin 


Tip 
X X c 


Tip 
X X c 


Val 








1UU 










1UO 










1 1 n 
11U 






Ser 


Leu 


Tnr 


CjIU 


Ser 


Cys 


Arg 


Ser 


Cys 


Ser 


flX o 


Al a 
/VI d 


Leu 


Ala 


Al A 

M.1 d 


Hi c: 
m 0 






115 










Iz U 










Iz 3 








Val 


Ser 


His 


Leu 


Glu 


Asn 


v ai 


Ser 


pi t * 

olU 


pi n 

LjIU 


L7IU 


JYie u 


Asp 


Arg 


Leu 


Leu 




130 










toe 










1 /in 
1 4 U 










Gly 


He 


val 


Leu 


Asp 


Val 


m t i 

Li 


Tyr 


Leu 




1 11 1 




Va 1 
v d 1 


H -i «; 
n 1 £> 


Lys 


m 1 1 


1 /l c 










1 Rfl 
1 JU 










1 

1 -J w> 










X D V 


Glu 


Asp 


Aid 


Asp 


i nr 


Lys 


p 1 n 


v dl 


Tyr 


X~ 11C 


i yr 


Leu 




Ly s 


Leu 
































175 




Arg 


Tire? 


Ser 


Tic 

lie 


Leu 




Arg 


wi y 


iiy s 


xr j_ u 


1 

v ax 


Val 


Glu 


Gly 


Ser 


Leu 








1 on 
1 o u 










X O J 










190 








Ly s 


Lys 


Pro 


Pro 


XT 11C 


wX LI 


Xjy O 


JT X w 


Ser 


lie 


Glu 


Gin 


Gly 


Val 


Asn 




i y o 










Z U U 










z U J 








Asn 


Pne 


vai 


Gin 


Tyr 


Lys 


trne 


Ser 


ill s 


Leu 


Pro 


JCl 




Glu 


A rrr 


Gl n 
ox 1 1 




210 










Z ID 










O O fl 

z z u 










Thr 


i nr 


lie 


olU 


Leu 


./-VI d 


Lys 


1 iC L* 


irl ic 


Leu 


As n 


j\r g 


He 


As n 


fur 
x y 1 


x ip 


O O CL 




















7 ^ s 

Z. O «J 












His 


Leu 


LrlU 


Hid 


Pro 


Ser 




Arg 


j\e g 


lieu 


A m 


Ser 


Pro 


As n 


Asp 


Asp 




















250 










255 




He 


Ser 


oiy 


Tyr 


Lys 


fin 

IjiU 


Asn 


Tyr 


X 111 


Arg 


Trp 


T .^1 1 
XjC IX 


fire 


1 yi 


Cys 


Asn 








o & n 










9 S 










97 0 

£. i \J 






val 


Pro 


bin 


rile 


Cys 


Asp 


Ser 


Leu 


Pro 


Arg 


Tyr 


m 11 

V7l LL 


X 11X 


X 11X 


i»y s 


Va 1 
v ax 






Z / 3 










*? ft n 










9 R S 
to j 








Phe 


Gly 


Arg 


inr 


Leu 


Leu 


Arg 


Ser 


Val 


xr ne 


rp U r- 

i nr 


Tip 

lie 




Arg 


Arg 


VJX 11 




290 










O Q R 

^yD 




















Leu 


Leu 


Glu 


Gin 


Ala 


Arg 


Gin 


Lys 


Lys 


Asp 


Lys 


Leu 


Pro 


Leu 


pi ,< 
Lriu 


Lys 


305 










J 1 U 










0 1 0 










1 0 n 
oz u 


Arg 


Thr 


Leu 


He 


Leu 


Thr 


His 


rue 


Pro 


Lys 


fne 


Leu 


Ser 


ncL 


Leu 


p ") 1 1 
(jl u 








32 5 










"3 O A 
J jU 










Q *3 R 




Glu 


Glu 


Val 


Tyr 


Ser 


Gin 


Asn 


Ser 


Pro 


lie 


Trp 


Asp 


bin 


Asp 


DK a 

fne 


Leu 








340 










■3 /i c 

o40 










ODU 






Ser 


Ala 


Ser 


Ser 


Arg 


Thr 


Ser 


Pro 


Leu 


vxiy 


lie 


pi « 

vrrin 


inr 


Val 


lie 


Ser 






355 










o c n 










■sec 

JO J 








Pro 


Pro 


Val 


Thr 


Gly 


Thr 


Ala 


Leu 


Phe 


O y-w -y- 

ser 


Ser 


Asn 


Ser 


1 nr 


Ser 


ill s 




37 0 








375 










JOU 










Glu 


P T n 

Gin 


lie 


Asn 


Gly 


Gly Arg 


i nr 


Ser 


Pro 


rz~\ \r 
ul y 


Cys 


A.r g 


p, i , 7 

ui y 




Ser 


O O R 
O O 3 










390 










J ^ w) 










d no 


P 1 \ r 
(jXy 


Leu 


tjlU 


>vi a 


As n 


Pro 


Gly 


OX Ll 




IV y~ rr 
^vx. y 


Xiy d 


Met 


As n 


Asn 


Ser 


His 








H U D 










A 1 A 
*4 1 \J 










H 1 -J 




Ala 


Pro 


blU 


Glu 


Ai a 


Lys 


Arg 


Ser 


Arg 


val 


1YIC L- 


01 y 


Asp 


Tip 
lie 


Pro 


Va 1 
v a x 








a "y n 
4ZU 










AO ^ 










ft jU 






p"l 11 

vjX Li 


Le u 


lie 


A*s n 


Glu 


Val 


Met 


Ser 


Thr 


lie 


Thr 


Asp 


Pro 


Ala 


Gl v 


Met 






435 










440 










445 








Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala Arg Asp Glu

450 455 460

Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe His Val Val

465 470 475 480

Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu Met Trp Leu

485 490 495

Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys

500 505 510

Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala

515 520 525

Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe

530 535 540

Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn

545 550 555 560

Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu

565 570 575

Tyr His Ile Lys His Glu Ile Leu Asn Phe Leu Thr Tyr Ala Asp Glu

580 585 590

Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Glu Ile Lys

595 600 605

Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr Glu Gly Ala

610 615 620

Thr Leu Met Gly Cys Glu Leu Asn Pro Gln Ile Pro Tyr Thr Glu Phe

625 630 635 640

Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu

645 650 655

Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe

660 665 670

Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro Gly Ile Arg

675 680 685

Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys Glu Pro Lys

690 695 700

Asp Pro Glu His Val Tyr Ser Thr Leu Lys Asn Ile Leu Gln Gln Val

705 710 715 720

Lys Asn His Pro Asn Ala Trp Pro Phe Met Glu Pro Val Lys Arg Thr

725 730 735

Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Phe Pro Met Asp Leu Lys

740 745 750

Thr Met Ser Glu Arg Leu Arg Asn Arg Tyr Tyr Val Ser Lys Lys Leu

755 760 765

Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys Glu Tyr Asn

770 775 780

Pro Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Ser Ile Leu Glu Lys Phe

785 790 795 800

Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys

805 810



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

His Thr Lys Gly Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys

1 5 10 15

Lys Gln Leu Ile Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu

20 25 30

Asn Lys Cys Pro Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg
35 40 45

Gln Gln
50 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2204 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ACCCACTCCC CCCAGAGCCG ACCTGCAGCA AATAATTGTC AGTCTAACAG AATCCTGTCG 60 

GAGTTGTAGC CATGCCCTAG CTGCTCATGT TTCCCACCTG GAGAATGTGT CAGAGGAAGA 120

AATGAACAGA CTCCTGGGAA TAGTATTGGA TGTGGAATAT CTCTTTACCT GTGTCCACAA 180

GGAAGAAGAT GCAGATACCA AACAAGTTTA TTTCTATCTA TTTAAGCTCT TGAGAAAGTC 240

TATTTTACAA AGAGGAAAAC CTGTGGTTGG AAGGCTCTTT GGAAAAGAAA CCCCCATTTG 300

AAAAACCTAG CATTGAACAG GGTGTGAATA ACTTTGTGCA GTACAAATTT AGTCACCTGC 360

CAGCAAAAAG AAAGGCAAAC CAATAGTTGA GTTGGCAAAA ATGTTCCTAA ACCGCATCAC 420

CTATTGGCAT CTGGAGGCAC CATCTCAACG AGACTGCGAT CTCCAATGAT GATATTCTGG 480

ATACAAAGAG AACTACACAA GGTGGCTGTG TTACTGCAAC GTGCCACAGT TCTGCGACAG 540

TCTACCTCGG TACGAAACCA CACAGGTGTT TGGGAGAACA TCGTTCGCTC GGTCTTCACT 600

GTTATGAGGC GACAACTCCT GGAACAAGCA AGACAGGAAA AAGATAAACT GCCTCTTGAA 660

AAACGAACTC TAATCCTCAC TCATTTCCCA AAATTTCTGT CCATGCTAGA AGAAGAAGTA 720

TATAGTCAAA ACTCTCCCAT CTGGGATCAC CATTTTCTCT CAGCCTCTTC CAGAACCAGC 780

CAGCTAGGCA TCCAAACAGT TATCAATCAC CTCCTGTGGC TGGGACAATT TCATACAATT 840

CAACCTCATC TTCCCTTGAG CAGCCAAACG GAGGGAGCAG CAGTCCTGCC TGCAAAGCCT 900

CTTCTGGACT TGAGGCAAAC CCAGGAGAAA AGAGGAAAAT GACTGATTCT CATGTTCTGG 960

AGGAGGCCAA GAAACCCCGA GTTATGGGGG ATATTCCGAT GGAATTAATC AACGAGGTTA 1020

TGTCTACCAT CACGGACCCT GCAGCAATGC TTGGACCAGA GACCAATTTT CTGTCAGCAC 1080

ACTCGGCCAG GGATGAGGCG GCAAGGTTGG AAGAGCGCAG GGGTGTAATT GAATTTCACG 1140

TGGTTGGCAA TTCCCTCAAC CAGAAACCAA ACAAGAAGAT CCTGATGTGG CTGGTTGGCC 1200

TACAGAACGT TTTCTCCCAC CAGCTGCCCC GAATGCCAAA AGAATACATC ACACGGCTCG 1260

TCTTTGACCC GAAACACAAA ACCCTTGCTT TAATTAAAGA TGGCCGTGTT ATTGGTGGTA 1320

TCTGTTTCCG TATGTTCCCA TCTCAAGGAT TCACAGAGAT TGTCTTCTGT GCTGTAACCT 1380

CAAATGAGCA AGTCAAGGGC TATGGAACAC ACCTGATGAA TCATTTGAAA GAATATCACA 1440

TAAAGCATGA CATCCTGAAC TTCCTCACAT ATGCAGATGA ATATGCAATT GGATACTTTA 1500

AGAAACAGGG TTTCTCCAAA GAAATTAAAA TACCTAAAAC CAAATATGTT GGCTATATCA 1560

AGGATTATGA AGGAGCCACT TTAATGGGAT GTGAGCTAAA TCCACGGATC CCGTACACAG 1620

AATTTTCTGT CATCATTAAA AAGCAGAAGG AGATAATTAA AAAACTGATT GAAAGAAAAC 1680

AGGCACAAAT TCGAAAAGTT TACCCTGGAC TTTCATGTTT TAAAGATGGA GTTCGACAGA 1740

TTCCTATAGA AAGCATTCCT GGAATTAGAG AGACAGGCTG GAAACCGAGT GGAAAAGAGA 1800

AAAGTAAAGA GCCCAGAGAC CCTGACCAGC TTTACAGCAC GCTCAAGAGC ATCCTCCAGC 1860

AGGTGAAGAG CCATCAAAGC GCTTGGCCCT TCATGGAACC TGTGAAGAGA ACAGAAGCTC 1920

CAGGATATTA TGAAGTTATA AGGTCCCCCA TGGATCTCAA AACCATGAGT GAACGCCTCA 1980

AGAATAGGTA CTACGTGTCT AAGAAATTAT TCATGGCAGA CTTACAGCGA GTCTTTACCA 2040

ATTGCAAAGA GTACAACGCC CCTGAGAGTG AATACTACAA ATGTGCCAAT ATCCTGGAGA 2100

AATTCTTCTT CAGTAAAATT AAGGAAGCTG GATTAATTGA CAAGTGATTT TTTTTCCCCC 2160

TCTGCTTCTT AGAAACTCAC CAAGCAGTGT GCCTAAAGCA AGGT 2204

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GAATTCCGGC GAAACCACTC ATGTCTTTGG GCGAAGCCTT CTCCGGTCCA TTTTCACCGT 60 

TACCCGCCGG CAGCTGCTGG AAAAGTTCCG AGTGGAGAAG GACAAATTGG TGCCCGAGAA 120

GAGGACCCTC ATCCTCACTC ACTTCCCCAA GTAAGGCTCC TTCTGGCCTA CCAGGATTTG 180

GCCCCAAGTT CACATCCTCC CTGTTGTCCC CTTTTTTCCA GGAAGGCTTC CTGGATTGGT 240

CCCTCCTCTC CCTCCATGGG CCTTTTGGGA TCTGGGCGTC TACCTGGCAG ACTTGCCCAT 300

GGCCCAGAAG CAACTTGCTA GTACTAGTCT GGGGATGGCA GATTCCTGTC CATGCTGGAG 360

GAGGAGATCT ATGGGGCAAA CTCTCCAATC TGGGAGTCAG GCTTCACCAT GCCACCCTCA 420

GAGGGGACAC AGCTGGTTCC CCGGCCAGCT TCAGTCAGTG CAGCGGTTGT TCCCAGCACC 480

CCCATCTTCA GCCCCAGCAT GGGTGGGGGC AGCAACAGCT CCCTGAGTCT GGATTCTGCA 540

GGGGCCGAGC CTATGCCAGG CGAGAAGAGG ACGCTCCCAG AGAACCTGAC CCTGGAGGAT 600

GCCAAGCGGC TCCGTGTGAT GGGTGACATC CCCATGGAGC TGGTCAATGA GGTCATGCTG 660

ACCATCACTG ACCCTGCTGC CATGCTGGGG CCTGAGACGA GCCTGCTTTC GGCCAATGCG 720

GCCCGGGATG AGACAGCCCG CCTGGAGGAG CGCCGCGGCA TCATCGAGTT CCATGTCATC 780

GGCAACTCAC TGACGCCCAA GGCCAACCGG CGGGTGTTGC TGTGGCTCGT GGGGCTGCAG 840

AATGTCTTTT CCCACCAGCT GCCGCGCATG CCTAAGGAGT ATATCGCCCG CCTCGTCTTT 900

GACCCGAAGC ACAAGACTCT GGCCTTGATC AAGGATGGGC GGGTCATCGG TGGCATCTGC 960

TTCCGCATGT TTCCCACCCA GGGCTTCACG GAGATTGTCT TCTGTGCTGT CACCTCGAAT 1020

GAGCAGGTCA AGGGTTATGG GACCCACCTG ATGAACCACC TGAAGGAGTA TCACATCAAG 1080

CACAACATTC TCTACTTCCT CACCTACGCC GACGAGTACG CCATCGGCTA CTTCAAAAAG 1140

CAGGGTTTCT CCAAGGACAT CAAGGTGCCC AAGAGCCGCT ACCTGGGCTA CATCAAGGAC 1200

TACGAGGGAG CGACGCTGAT GGAGTGTGAG CTGAATCCCC GCATCCCCTA CACGGAGCTG 1260

TCCCACATCA TCAAGAAGCA GAAAGAGATC ATCAAGAAGC TGATTGAGCG CAAACAGGCC 1320

CAGATCCGCA AGGTCTACCC GGGGCTCAGC TGCTTCAAGG AGGGCGTGAG GCAGATCCCT 1380

GTGGAGAGCG TTCCTGGCAT TCGAGAGACA GGCTGGAAGC CATTGGGGAA GGAGAAGGGG 1440

AAGGAGCTGA AGGACCCCGA CCAGCTCTAC ACAACCCTCA AAAACCTGCT GGCCCAAATC 1500

AAGTCTCACC CCAGTGCCTG GCCCTTCATG GAGCCTGTGA AGAAGTCGGA GGCCCCTGAC 1560

TACTACGAGG TCATCCGCTT CCCCATTGAC CTGAAGACCA TGACTGAGCG GCTGCGAAGC 1620

CGCTACTACG TGACCCGGAA GCTCTTTGTG GCCGACCTGC AGCGGGTCAT CGCCAACTGT 1680

CGCGAGTACA ACCCCCCGGA CAGCGAGTAC TGCCGCTGTG CCAGCGCCCT GGAGAAGTTC 1740

TTCTACTTCA AGCTCAAGGA GGGAGGCCTC ATTGACAAGT AGGCCCATCT TTGGGCCGCA 1800

GCCCTGACCT GGAATGTCTC CACCTCGGAT TCTGATCTGA TCCTTAGGGG GTGCCCTGGC 1860

CCCACGGACC CGACTCAGCT TGAGACACTC CAGCCAAGGG TCCTCCGGAC CCGATCCTGC 1920

AGCTCTTTCT GGACCTTCAG GCACCCCCAA GCGTGCAGCT CTGTCCCAGC CTTCACTGTG 1980

TGTGAGAGGT CTCCTGGGTT GGGGCCCAGC CCCTCTAGAG TAGCTGGTGG CCAGGGATGA 2040

ACCTTGCCCA GCCGTGGTGG CCCCCAGGCC TGGTCCCCAA GAGCCCGGAA TTC 2093

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9046 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CCTTGTTTGT GTGCTAGGCT GGGGGGGAGA GAGGGCGAGA GAGAGCGGGC GAGAGTGGGC 60

AAGCAGGACG CCGGGCTGAG TGCTAACTGC GGGACGCAGA GAGTGCGGAG GGGAGTCGGG 120

TCGGAGAGAG GCGGCAGGGG CCAGAACAGT GGCAGGGGGC CCGGGGCGCA CGGGCTGAGG 180

CGACCCCCAG CCCCCTCCCG TCCGCACACA CCCCCACCGC GGTCCAGCAG CCGGGCCGGC 240

GTCGACGCTA GGGGGGACCA TTACATAACC CGCGCCCCGG CCGTCTTCTC CCGCCGCCGC 300

GGCGCCCGAA CTGAGCCCGG GGCGGGCGCT CCAGCACTGG CCGCCGGCGT GGGGCGTAGC 360

AGCGGCCGTA TTATTATTTC GCGGAAAGGA AGGCGAAGGA GGGGAGCGCC GGCGCGAGGA 420

GGGGCCGCCT GCGCCCGCCG CCGGAGCGGG GCCTCCTCGG TGGGCTCCGC GTCGGCGCGG 480

GCGTGCGGGC GGCGCTGCTC GGCCCGGCCC CCTCGGCCCT CTGGTCCGGC CAGCTCCGCT 540

CCCGGCGTCC TTGCCGCGCC TCCGCCGGCC GCCGCGCGAT GTGAGGCGGC GGCGCCAGCC 600

TGGCTCTCGG CTCGGGCGAG TTCTCTGCGG CCATTAGGGG CCGGTGCGGC GGCGGCGCGG 660

AGCGCGGCGG CAGGAGGAGG GTTCGGAGGG TGGGGGCGCA GGCCCGGGAG GGGGCACCGG 720

GAGGAGGTGA GTGTCTCTTG TCGCCTCCTC CTCTCCCCCC TTTTCGCCCC CGCCTCCTTG 780

TGGCGATGAG AAGGAGGAGG ACAGCGCCGA GGAGGAAGAG GTTGATGGCG GCGGCGGAGC 840

TCCGAGAGAC CTCGGCTGGG CAGGGGCCGG CCGTGGCGGG CCGGGGACTG CGCCTCTAGA 900

GCCGCGAGTT CTCGGGAATT CGCCGCAGCG GACCGGCCTC GGCGAATTTG TGCTCTTGTG 960

CCCTCCTCCG GGCTTGGGCC AGGCCGGCCC CTCGCACTTG CCCTTACCTT TTCTATCGAG 1020

TCCGCATCCC TCTCCAGCCA CTGCGACCCG GCGAAGAGAA AAAGGAACTT CCCCCACCCC 1080

CTCGGGTGCC GTCGGAGCCC CCCAGCCCAC CCCTGGGTGC GGCGCGGGGA CCCCGGGCCG 1140

AAGAAGAGAT TTCCTGAGGA TTCTGGTTTT CCTCGCTTGT ATCTCCGAAA GAATTAAAAA 1200

TGGCCGAGAA TGTGGTGGAA CCGGGGCCGC CTTCAGCCAA GCGGCCTAAA CTCTCATCTC 1260

CGGCCCTCTC GGCGTCCGCC AGCGATGGCA CAGATTTTGG CTCTCTATTT GACTTGGAGC 1320

ACGACTTACC AGATGAATTA ATCAACTCTA CAGAATTGGG ACTAACCAAT GGTGGTGATA 1380

TTAATCAGCT TCAGACAAGT CTTGGCATGG TACAAGATGC AGCTTCTAAA CATAAACAGC 1440

TGTCAGAATT GCTGCGATCT GGTAGTTCCC CTAACCTCAA TATGGGAGTT GGTGGCCCAG 1500

GTCAAGTCAT GGCCAGCCAG GCCCAACAGA GCAGTCCTGG ATTAGGTTTG ATAAATAGCA 1560

TGGTCAAAAG CCCAATGACA CAGGCAGGCT TGACTTCTCC CAACATGGGG ATGGGCACTA 1620

GTGGACCAAA TCAGGGTCCT ACGCAGTCAA CAGGTATGAT GAACAGTCCA GTAAATCAGC 1680

CTGCCATGGG AATGAACACA GGGACGAATG CGGGCATGAA TCCTGGAATG TTGGCTGCAG 1740

GCAATGGACA AGGGATAATG CCTAATCAAG TCATGAACGG TTCAATTGGA GCAGGCCGAG 1800

GGCGACAGGA TATGCAGTAC CCAAACCCAG GCATGGGAAG TGCTGGCAAC TTACTGACTG 1860

AGCCTCTTCA GCAGGGCTCT CCCCAGATGG GAGGACAAAC AGGATTGAGA GGCCCCCAGC 1920

CTCTTAAGAT GGGAATGATG AACAACCCCA ATCCTTATGG TTCACCATAT ACTCAGAATC 1980

CTGGACAGCA GATTGGAGCC AGTGGCCTTG GTCTCCAGAT TCAGACAAAA ACTGTACTAT 2040

CAAATAACTT ATCTCCATTT GCTATGGACA AAAAGGCAGT TCCTGGTGGA GGAATGCCCA 2100

ACATGGGTCA ACAGCCAGCC CCGCAGGTCC AGCAGCCAGG TCTGGTGACT CCAGTTGCCC 2160

AAGGGATGGG TTCTGGAGCA CATACAGCTG ATCCAGAGAA GCGCAAGCTC ATCCAGCAGC 2220

AGCTTGTTCT CCTTTTGCAT GCTCACAAGT GCCAGCGCCG GGAACAGGCC AATGGGGAAG 2280

TGAGGCAGTG CAACCTTCCC CACTGTCGCA CAATGAAGAA TGTCCTAAAC CACATGACAC 2340

ACTGCCAGTC AGGCAAGTCT TGCCAAGTGG CACACTGTGC ATCTTCTCGA CAAATCATTT 2400

CACACTGGAA GAATTGTACA AGACATGATT GTCCTGTGTG TCTCCCCCTC AAAAATGCTG 2460

GTGATAAGAG AAATCAACAG CCAATTTTGA CTGGAGCACC CGTTGGACTT GGAAATCCTA 2520

GCTCTCTAGG GGTGGGTCAA CAGTCTGCCC CCAACCTAAG CACTGTTAGT CAGATTGATC 2580

CCAGCTCCAT AGAAAGAGCC TATGCAGCTC TTGGACTACC CTATCAAGTA AATCAGATGC 2640

CGACACAACC CCAGGTGCAA GCAAAGAACC AGCAGAATCA GCAGCCTGGG CAGTCTCCCC 2700

AAGGCATGCG GCCCATGAGC AACATGAGTG CTAGTCCTAT GGGAGTAAAT GGAGGTGTAG 2760

GAGTTCAAAC GCCGAGTCTT CTTTCTGACT CAATGTTGCA TTCAGCCATA AATTCTCAAA 2820

ACCCAATGAT GAGTGAAAAT GCCAGTGTGC CCTCCCTGGG TCCTATGCCA ACAGCAGCTC 2880

AACCATCCAC TACTGGAATT CGGAAACAGT GGCACGAAGA TATTACTCAG GATCTTCGAA 2940

ATCATCTTGT TCACAAACTC GTCCAAGCCA TATTTCCTAC GCCGGATCCT GCTGCTTTAA 3000

AAGACAGACG GATGGAAAAC CTAGTTGCAT ATGCTCGGAA AGTTGAAGGG GACATGTATG 3060

AATCTGCAAA CAATCGAGCG GAATACTACC ACCTTCTAGC TGAGAAAATC TATAAGATCC 3120

AGAAAGAACT AGAAGAAAAA CGAAGGACCA GACTACAGAA GCAGAACATG CTACCAAATG 3180

CTGCAGGCAT GGTTCCAGTT TCCATGAATC CAGGGCCTAA CATGGGACAG CCGCAACCAG 3240

GAATGACTTC TAATGGCCCT CTACCTGACC CAAGTATGAT CCGTGGCAGT GTGCCAAACC 3300

AGATGATGCC TCGAATAACT CCACAATCTG GTTTGAATCA ATTTGGCCAG ATGAGCATGG 3360

CCCAGCCCCC TATTGTACCC CGGCAAACCC CTCCTCTTCA GCACCATGGA CAGTTGGCTC 3420

AACCTGGAGC TCTCAACCCG CCTATGGGCT ATGGGCCTCG TATGCAACAG CCTTCCAACC 3480

AGGGCCAGTT CCTTCCTCAG ACTCAGTTCC CATCACAGGG AATGAATGTA ACAAATATCC 3540

CTTTGGCTCC GTCCAGCGGT CAAGCTCCAG TGTCTCAAGC ACAAATGTCT AGTTCTTCCT 3600

GCCCGGTGAA CTCTCCTATA ATGCCTCCAG GGTCTCAGGG GAGCCACATT CACTGTCCCC 3660

AGCTTCCTCA ACCAGCTCTT CATCAGAATT CACCCTCGCC TGTACCTAGT CGTACCCCCA 3720

CCCCTCACCA TACTCCCCCA AGCATAGGGG CTCAGCAGCC ACCAGCAACA ACAATTCCAG 3780

CCCCTGTTCC TACACCACCA GCCATGCCAC CTGGGCCACA GTCCCAGGCT CTACATCCCC 3840

CTCCAAGGCA GACACCTACA CCACCAACAA CACAACTTCC CCAACAAGTG CAGCCTTCAC 3900

TTCCTGCTGC ACCTTCTGCT GACCAGCCCC AGCAGCAGCC TCGCTCACAG CAGAGCACAG 3960

CAGCGTCTGT TCCTACCCCA AACGCACCGC TGCTTCCTCC GCAGCCTGCA ACTCCACTTT 4020

CCCAGCCAGC TGTAAGCATT GAAGGACAGG TATCAAATCC TCCATCTACT AGTAGCACAG 4080

AAGTGAATTC TCAGGCCATT GCTGAGAAGC AGCCTTCCCA GGAAGTGAAG ATGGAGGCCA 4140

AAATGGAAGT GGATCAACCA GAACCAGCAG ATACGCAGCC GGAGGATATT TCAGAGTCTA 4200

AAGTGGAAGA CTGTAAAATG GAATCTACCG AAACAGAAGA GAGAAGCACT GAGTTAAAAA 4260

CTGAAATAAA AGAGGAGGAA GACCAGCCAA GTACTTCAGC TACCCAGTCA TCTCCGGCTC 4320

CAGGACAGTC AAAGAAAAAG ATTTTCAAAC CAGAAGAACT ACGACAGGCA CTGATGCCAA 4380

CATTGGAGGC ACTTTACCGT CAGGATCCAG AATCCCTTCC CTTTCGTCAA CCTGTGGACC 4440

CTCAGCTTTT AGGAATCCCT GATTACTTTG ATATTGTGAA GAGCCCCATG GATCTTTCTA 4500

CCATTAAGAG GAAGTTAGAC ACTGGACAGT ATCAGGAGCC CTGGCAGTAT GTCGATGATA 4560

TTTGGCTTAT GTTCAATAAT GCCTGGTTAT ATAACCGGAA AACATCACGG GTATACAAAT 4620

ACTGCTCCAA GCTCTCTGAG GTCTTTGAAC AAGAAATTGA CCCAGTGATG CAAAGCCTTG 4680

GATACTGTTG TGGCAGAAAG TTGGAGTTCT CTCCACAGAC ACTGTGTTGC TACGGCAAAC 4740

AGTTGTGCAC AATACCTCGT GATGCCACTT ATTACAGTTA CCAGAACAGG TATCATTTCT 4800

GTGAGAAGTG TTTCAATGAG ATCCAAGGGG AGAGCGTTTC TTTGGGGGAT GACCCTTCCC 4860

AGCCTCAAAC TACAATAAAT AAAGAACAAT TTTCCAAGAG AAAAAATGAC ACACTGGATC 4920

CTGAACTGTT TGTTGAATGT ACAGAGTGCG GAAGAAAGAT GCATCAGATC TGTGTCCTTC 4980

ACCATGAGAT CATCTGGCCT GCTGGATTCG TCTGTGATGG CTGTTTAAAG AAAAGTGCAC 5040

GAACTAGGAA AGAAAATAAG TTTTCTGCTA AAAGGTTGCC ATCTACCAGA CTTGGCACCT 5100

TTCTAGAGAA TCGTGTGAAT GACTTTCTGA GGCGACAGAA TCACCCTGAG TCAGGAGAGG 5160

TCACTGTTAG AGTAGTTCAT GCTTCTGACA AAACCGTGGA AGTAAAACCA GGCATGAAAG 5220

CAAGGTTTGT GGACAGTGGA GAGATGGCAG AATCCTTTCC ATACCGAACC AAAGCCCTCT 5280

TTGCCTTTGA AGAAATTGAT GGTGTTGACC TGTGCTTCTT TGGCATGCAT GTTCAAGAGT 5340

ATGGCTCTGA CTGCCCTCCA CCCAACCAGA GGAGAGTATA CATATCTTAC CTCGATAGTG 5400

TTCATTTCTT CCGTCCTAAA TGCTTGAGGA CTGCAGTCTA TCATGAAATC CTAATTGGAT 5460

ATTTAGAATA TGTCAAGAAA TTAGGTTACA CAACAGGGCA TATTTGGGCA TGTCCACCAA 5520

GTGAGGGAGA TGATTATATC TTCCATTGCC ATCCTCCTGA CCAGAAGATA CCCAAGCCCA 5580

AGCGACTGCA GGAATGGTAC AAAAAAATGC TTGACAAGGC TGTATCAGAG CGTATTGTCC 5640

ATGACTACAA GGATATTTTT AAACAAGCTA CTGAAGATAG ATTAACAAGT GCAAAGGAAT 5700

TGCCTTATTT CGAGGGTGAT TTCTGGCCCA ATGTTCTGGA AGAAAGCATT AAGGAACTGG 5760

AACAGGAGGA AGAAGAGAGA AAACGAGAGG AAAACACCAG CAATGAAAGC ACAGATGTGA 5820

CCAAGGGAGA CAGCAAAAAT GCTAAAAAGA AGAATAATAA GAAAACCAGC AAAAATAAGA 5880

GCAGCCTGAG TAGGGGCAAC AAGAAGAAAC CCGGGATGCC CAATGTATCT AACGACCTCT 5940

CACAGAAACT ATATGCCACC ATGGAGAAGC ATAAAGAGGT CTTCTTTGTG ATCCGCCTCA 6000

TTGCTGGCCC TGCTGCCAAC TCCCTGCCTC CCATTGTTGA TCCTGATCCT CTCATCCCCT 6060

GCGATCTGAT GGATGGTCGG GATGCGTTTC TCACGCTGGC AAGGGACAAG CACCTGGAGT 6120

TCTCTTCACT CCGAAGAGCC CAGTGGTCCA CCATGTGCAT GCTGGTGGAG CTGCACACGC 6180

AGAGCCAGGA CCGCTTTGTC TACACCTGCA ATGAATGCAA GCACCATGTG GAGACACGCT 6240

GGCACTGTAC TGTCTGTGAG GATTATGACT TGTGTATCAC CTGCTATAAC ACTAAAAACC 6300

ATGACCACAA AATGGAGAAA CTAGGCCTTG GCTTAGATGA TGAGAGCAAC AACCAGCAGG 6360

CTGCAGCCAC CCAGAGCCCA GGCGATTCTC GCCGCCTGAG TATCCAGCGC TGCATCCAGT 6420

CTCTGGTCCA TGCTTGCCAG TGTCGGAATG CCAATTGCTC ACTGCCATCC TGCCAGAAGA 6480

TGAAGCGGGT TGTGCAGCAT ACCAAGGGTT GCAAACGGAA AACCAATGGC GGGTGCCCCA 6540

TCTGCAAGCA GCTCATTGCC CTCTGCTGCT ACCATGCCAA GCACTGCCAG GAGAACAAAT 6600

GCCCGGTGCC GTTCTGCCTA AACATCAAGC AGAAGCTCCG GCAGCAACAG CTGCAGCACC 6660

GACTACAGCA GGCCCAAATG CTTCGCAGGA GGATGGCCAG CATGCAGCGG ACTGGTGTGG 6720

TTGGGCAGCA ACAGGGCCTC CCTTCCCCCA CTCCTGCCAC TCCAACGACA CCAACTGGCC 6780

AACAGCCAAC CACCCCGCAG ACGCCCCAGC CCACTTCTCA GCCTCAGCCT ACCCCTCCCA 6840

ATAGCATGCC ACCCTACTTG CCCAGGACTC AAGCTGCTGG CCCTGTGTCC CAGGGTAAGG 6900

CAGCAGGCCA GGTGACCCCT CCAACCCCTC CTCAGACTGC TCAGCCACCC CTTCCAGGGC 6960

CCCCACCTAC AGCAGTGGAA ATGGCAATGC AGATTCAGAG AGCAGCGGAG ACGCAGCGCC 7020

AGATGGCCCA CGTGCAAATT TTTCAAAGGC CAATCCAACA CCAGATGCCC CCGATGACTC 7080

CCATGGCCCC CATGGGTATG AACCCACCTC CCATGACCAG AGGTCCCAGT GGGCATTTGG 7140

AGCCAGGGAT GGGACCGACA GGGATGCAGC AACAGCCACC CTGGAGCCAA GGAGGATTGC 7200

CTCAGCCCCA GCAACTACAG TCTGGGATGC CAAGGCCAGC CATGATGTCA GTGGCCCAGC 7260

ATGGTCAACC TTTGAACATG GCTCCACAAC CAGGATTGGG CCAGGTAGGT ATCAGCCCAC 7320

TCAAACCAGG CACTGTGTCT CAACAAGCCT TACAAAACCT TTTGCGGACT CTCAGGTCTC 7380

CCAGCTCTCC CCTGCAGCAG CAACAGGTGC TTAGTATCCT TCACGCCAAC CCCCAGCTGT 7440

TGGCTGCATT CATCAAGCAG CGGGCTGCCA AGTATGCCAA CTCTAATCCA CAACCCATCC 7500

CTGGGCAGCC TGGCATGCCC CAGGGGCAGC CAGGGCTACA GCCACCTACC ATGCCAGGTC 7560

AGCAGGGGGT CCACTCCAAT CCAGCCATGC AGAACATGAA TCCAATGCAG GCGGGCGTTC 7620

AGAGGGCTGG CCTGCCCCAG CAGCAACCAC AGCAGCAACT CCAGCCACCC ATGGGAGGGA 7680

TGAGCCCCCA GGCTCAGCAG ATGAACATGA ACCACAACAC CATGCCTTCA CAATTCCGAG 7740

ACATCTTGAG ACGACAGCAA ATGATGCAAC AGCAGCAGCA ACAGGGAGCA GGGCCAGGAA 7800

TAGGCCCTGG AATGGCCAAC CATAACCAGT TCCAGCAACC CCAAGGAGTT GGCTACCCAC 7860

CACAGCCGCA GCAGCGGATG CAGCATCACA TGCAACAGAT GCAACAAGGA AATATGGGAC 7920

AGATAGGCCA GCTTCCCCAG GCCTTGGGAG CAGAGGCAGG TGCCAGTCTA CAGGCCTATC 7980

AGCAGCGACT CCTTCAGCAA CAGATGGGGT CCCCTGTTCA GCCCAACCCC ATGAGCCCCC 8040

AGCAGCATAT GCTCCCAAAT CAGGCCCAGT CCCCACACCT ACAAGGCCAG CAGATCCCTA 8100

ATTCTCTCTC CAATCAAGTG CGCTCTCCCC AGCCTGTCCC TTCTCCACGG CCACAGTCCC 8160

AGCCCCCCCA CTCCAGTCCT TCCCCAAGGA TGCAGCCTCA GCCTTCTCCA CACCACGTTT 8220

CCCCACAGAC AAGTTCCCCA CATCCTGGAC TGGTAGCTGC CCAGGCCAAC CCCATGGAAC 8280

AAGGGCATTT TGCCAGCCCG GACCAGAATT CAATGCTTTC TCAGCTTGCT AGCAATCCAG 8340

GCATGGCAAA CCTCCATGGT GCAAGCGCCA CGGACCTGGG ACTCAGCACC GATAACTCAG 8400

ACTTGAATTC AAACCTCTCA CAGAGTACAC TAGACATACA CTAGAGACAC CTTGTATTTT 8460

GGGAGCAAAA AAATTATTTT CTCTTAACAA GACTTTTTGT ACTGAAAACA ATTTTTTTGA 8520

ATCTTTCGTA GCCTAAAAGA CAATTTTCCT TGGAACACAT AAGAACTGTG CAGTAGCCGT 8580

TTGTGGTTTA AAGCAAACAT GCAAGATGAA CCTGAGGGAT GATAGAATAC AAAGAATATA 8640

TTTTTGTTAT GGGCTGGTTA CCACCAGCCT TTCTTCCCCT TTGTGTGTGT GGTTCAAGTG 8700

TGCACTGGGA GGAGGCTGAG GCCTGTGAAG CCAAACAATA TGCTCCTGCC TTGCACCTCC 8760

AATAGGTTTT ATTATTTTTT TTAAATTAAT GAACATATGT AATATTAATG AACATATGTA 8820

ATATTAATAG TTATTATTTA CTGGTGCAGA TGGTTGACAT TTTTCCCTAT TTTCCTCACT 8880

TTATGGAAGA GTTAAAACAT TTCTAAACCA GAGGACAAAA GGGGTTAATG TTACTTTGAA 8940

ATTACATTCT ATATATATAT AAATATATAT AAATATATAT TAAAATACCA GTTTTTTTTC 9000

TCTGGGTGCA AAGATGTTCA TTCTTTTAAA AAATGTTTAA AAAAAA 9046

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7326 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGGCCGAGA ACTTGCTGGA CGGACCGCCC AACCCCAAAC GAGCCAAACT CAGCTCGCCC 60

GGCTTCTCCG CGAATGACAA CACAGATTTT GGATCATTGT TTGACTTGGA AAATGACCTT 120

CCTGATGAGC TGATCCCCAA TGGAGAATTA AGCCTTTTAA ACAGTGGGAA CCTTGTTCCA 180

GATGCTGCGT CCAAACATAA ACAACTGTCA GAGCTTCTTA GAGGAGGCAG CGGCTCTAGC 240

ATCAACCCAG GGATAGGCAA TGTGAGTGCC AGCAGCCCTG TGCAACAGGG CCTTGGTGGC 300

CAGGCTCAGG GGCAGCCGAA CAGTACAAAC ATGGCCAGCT TAGGTGCCAT GGGCAAGAGC 360

CCTCTGAACC AAGGAGACTC ATCAACACCC AACCTGCCCA AACAGGCAGC CAGCACCTCT 420

GGGCCCACTC CCCCTGCCTC CCAAGCACTG AATCCACAAG CACAAAAGCA AGTAGGGCTG 480

GTGACCAGTA GTCCTGCCAC ATCACAGACT GGACCTGGGA TCTGCATGAA TGCTAACTTC 540

AACCAGACCC ACCCAGGCCT TCTCAATAGT AACTCTGGCC ATAGCTTAAT GAATCAGGCT 600

CAACAAGGGC AAGCTCAAGT CATGAATGGA TCTCTTGGGG CTGCTGGAAG AGGAAGGGGA 660

GCTGGAATGC CCTACCCTGC TCCAGCCATG CAGGGGGCCA CAAGCAGTGT GCTGGCGGAG 720

ACCTTGACAC AGGTTTCCCC ACAAATGGCT GGCCATGCTG GACTAAATAC AGCACAGGCA 780

GGAGGCATGA CCAAGATGGG AATGACTGGT ACCACAAGTC CATTTGGACA ACCCTTTAGT 840

CAAACTGGAG GGCAGCAGAT GGGAGCCACT GGAGTGAACC CCCAGTTAGC CAGCAAACAG 900

AGCATGGTCA ATAGTTTACC TGCTTTTCCT ACAGATATCA AGAATACTTC AGTCACCACT 960

GTGCCAAATA TGTCCCAGTT GCAAACATCA GTGGGAATTG TACCCACACA AGCAATTGCA 1020

ACAGGCCCCA CAGCAGACCC TGAAAAACGC AAACTGATAC AGCAGCAGCT GGTTCTACTG 1080

CTTCATGCCC ACAAATGTCA GAGACGAGAG CAAGCAAATG GAGAGGTTCG NGCCTGTTCT 1140

CTCCCACACT GTCGAACCAT GAAAAACGTT TTGAATCACA TGACACATTG TCAGGCTCCC 1200

AAAGCCTGCC AAGTTGCCCA TTGTGCATCT TCACGACAAA TCATCTCTCA TTGGAAGAAC 1260

TGCACACGAC ATGACTGTCC TGTTTGCCTC CCTTTGAAAA ATGCCAGTGA CAAGCGAAAC 1320

CAACAAACCA TCCTGGGATC TCCAGCTAGT GGAATTCAAA ACACAATTGG TTCTGTTGGT 1380

GCAGGGCAAC AGAATGCCAC TTCCTTAAGT AACCCAAATC CCATAGACCC CAGTTCCATG 1440

CAGCGGGCCT ATGCTGCTCT AGGACTCCCC TACATGAACC AGCCTCAGAC GCAGCTGCAG 1500

CCTCAGGTTC CTGGCCAGCA ACCAGCACAG CCTCCAGCCC ACCAGCAGAT GAGGACTCTC 1560

AATGCCCTAG GAAACAACCC CATGAGTGTC CCAGCAGGAG GAATAACAAC AGATCAACAG 1620

CCACCAAACT TGATTTCAGA ATCAGCTCTT CCAACTTCCT TGGGGGCTAC CAATCCACTG 1680

ATGAATGATG GTTCAAACTC TGGTAACATT GGAAGCCTCA GCACGATACC TACAGCAGCG 1740

CCTCCTTCCA GCACTGGTGT TCGAAAAGGC TGGCATGAAC ATGTGACTCA GGACCTACGG 1800

AGTCATCTAG TCCATAAACT CGTTCAAGCC ATCTTCCCAA CTCCAGACCC TGCAGCTCTG 1860

AAAGATCGCC GCATGGAGAA CCTGGTTGCC TATGCTAAGA AAGTGGAGGG AGACATGTAT 1920

GAGTCTGCTA ATAGCAGGGA TGAATACTAT CATTTATTAG CAGAGAAAAT CTATAAAATA 1980

CAAAAAGAAC TAGAAGAAAA GCGGAGGACA CGTTTACATA AGCAAGGCAT CCTGGGTAAC 2040

CAGCCAGCTT TACCAGCTTC TGGGGCTCAG CCCCCTGTGA TTCCACCAGC CCAGTCTGTA 2100

AGACCTCCAA ATGGGCCCCT GCCTTTGCCA GTGAATCGCA TGCAGGTTTC TCAAGGGATG 2160

AATTCATTTA ACCCAATGTC CCTGGGAAAC GTCCAGTTGC CACAGGCACC CATGGGACCT 2220

CGTGCAGCCT CCCCTATGAA CCACTCTGTG CAGATGAACA GCATGGCCTC AGTTCCGGGT 2280

ATGGCCATTT CTCCTTCACG GATGCCTCAG CCTCCAAATA TGATGGGCAC TCATGCCAAC 2340

AACATTATGG CCCAGGCACC TACTCAGAAC CAGTTTCTGC CACAGAACCA GTTTCCATCA 2400

TCCAGTGGGG CAATGAGTGT GAACAGTGTG GGCATGGGGC AACCAGCAGC CCAGGCAGGT 2460

GTTTCACAGG GTCAGGAACC TGGAGCTGCT CTCCCTAACC CTCTGAACAT GCTGGCACCC 2520

CAGGCCAGCC AGCTGCCTTG CCCACCAGTG ACACAGTCAC CATTGCACCC GACTCCACCT 2580

CCTGCTTCCA CAGCTGCTGG CATGCCCTCT CTCCAACATC CAACGGCACC AGGAATGACC 2640

CCTCCTCAGC CAGCAGCTCC CACTCAGCCA TCTACTCCTG TGTCATCTGG GCAGACTCCT 2700

ACCCCAACTC CTGGCTCAGT GCCCAGCGCT GCCCAAACAC AGAGTACCCC TACAGTCCAG 2760

GCAGCAGCAC AGGCTCAGGT GACTCCACAG CCTCAGACCC CAGTGCAGCC ACCATCTGTG 2820

GCTACTCCTC AGTCATCACA GCAGCAACCA ACGCCTGTGC ATACTCAGCC ACCTGGCACA 2880

CCGCTTTCTC AGGCAGCAGC C AGCATT GAT AATAGAGTCC CTACTCCCTC CACT GT GAC C 2 94 0 
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AGTGCTGAAA CCAGTTCCCA GCAGCCAGGA CCCGATGTGC CCATGCTGGA AATGAAGACA 3000 

GAGGTGCAGA CAGATGATGC TGAGCCTGAA CCTACTGAAT CCAAGGGGGA ACCTCGGTCT 3060

GAGATGATGG AAGAGGATTT ACAAGGTTCT TCCCAAGTAA AAGAAGAGAC AGATACGACA 3120

GAGCAGAAGT CAGAGCCAAT GGAAGTAGAA GAAAAGAAAC CTGAAGTAAA AGTGGAAGCT 3180

AAAGAGGAAG AAGAGAACAG TTCGAACGAC ACAGCCTCAC AATCAACATC TCCTTCCCAG 3240

CCACGCAAAA AAATCTTTAA ACCCGAGGAG CTACGCCAGG CACTTATGCC AACTCTAGAA 3300

GCACTCTATC GACAGGACCC AGAGTCTTTG CCTTTTCGTC AGCCTGTAGA TCCTCAGCTC 3360

CTAGGAATCC CAGATTATTT TGATATAGTG AAGAATCCTA TGGACCTTTC TACCATCAAA 3420

CGAAAGCTGG ACACAGGGCA ATATCAAGAA CCCTGGCAGT ATGTGGATGA TGTCAGGCTT 3480

ATGTTCAACA ATGCGTGGCT ATATAATCGT AAAACGTCCC GTGTATATAA ATTTTGCAGT 3540

AAACTTGCAG AGGTCTTTGA ACAAGAAATT GACCCTGTCA TGCAGTCTCT TGGATATTGC 3600

TGTGGACGAA AGTATGAGTT CTCCCCACAG ACTTTGTGCT GTTACGGAAA GCAGCTGTGT 3660

ACAATTCCTC GTGATGCAGC CTACTACAGC TATCAGAATA GGTATCATTT CTGTGGGAAG 3720

TGTTTCACAG AGATCCAGGG CGAGAATGTG ACCCTGGGTG ACGACCCTTC CCAACCTCAG 3780

ACGACAATTT CCAAGGATCA ATTTGAAAAG AAGAAAAATG ATACCTTAGA TCCTGAACCT 3840

TTTGTTGACT GCAAAGAGTG TGGCCGGAAG ATGCATCAGA TTTGTGTTCT ACACTATGAC 3900

ATCATTTGGC CTTCAGGTTT TGTGTGTGAC AACTGTTTGA AGAAAACTGG CAGACCTCGG 3960

AAAGAAAACA AATTCAGTGC TAAGAGGCTG CAGACCACAC GATTGGGAAA CCACTTAGAA 4020

GACAGAGTGA ATAAGTTTTT GCGGCGCCAG AATCACCCTG AAGCTGGGGA GGTTTTTGTC 4080

AGAGTGGTGG CCAGCTCAGA CAAGACTGTG GAGGTCAAGC CGGGAATGAA GTCAAGGTTT 4140

GTGGATTCTG GAGAGATGTC GGAATCTTTC CCATATCGTA CCAAAGCACT CTTTGCTTTT 4200

GAGGAGATCG ATGGAGTCGA TGTGTGCTTT TTTGGGATGC ATGTGCAAGA TACGGCTCTG 4260

ATTGCCCCCC ACCAAATACA AGGCTGTGTA TACATATCTT ATCTGGACAG TATTCATTTC 4320

TTCCGGCCCC GCTGCCTCCG GACAGCTGTT TACCATGAGA TCCTCATCGG ATATCTCGAG 4380

TATGTGAAGA AATTGGTGTA TGTGACAGCA CATATTTGGG CCTGTCCCCC AAGTGAAGGA 4440

GATGACTATA TCTTTCATTG CCACCCCCCT GACCAGAAAA TCCCCAAACC AAAACGACTA 4500

CAGGAGTGGT ACAAGAAGAT GCTGGACAAG GCGTTTGCAG AGAGGATCAT TAACGACTAT 4560

AAGGACATCT TCAAACAAGC GAACGAAGAC AGGCTCACGA GTGCCAAGGA GTTGCCCTAT 4620

TTTGAAGGAG ATTTCTGGCC TAATGTGTTG GAAGAAAGCA TTAAGGAACT AGAACAAGAA 4680

GAAGAAGAAA GGAAAAAAGA AGAGAGTACT GCAGCGAGTG AGACTCCTGA GGGCAGTCAG 4740

GGTGACAGCA AAAATGCGAA GAAAAAGAAC AACAAGAAGA CCAACAAAAA CAAAAGCAGC 4800

ATTAGCCGCG CCAACAAGAA GAAGCCCAGC ATGCCCAATG TTTCCAACGA CCTGTCGCAG 4860

AAGCTGTATG CCACCATGGA GAAGCACAAG GAGGTATTCT TTGTGATTCA TCTGCATGCT 4920

GGGCCTGTTA TCAGCACTCA GCCCCCCATC GTGGACCCTG ATCCTCTGCT TAGCTGTGAC 4980

CTCATGGATG GGCGAGATGC CTTCCTCACC CTGGCCAGAG ACAAGCACTG GGAATTCTCT 5040

TCCTTACGCC GCTCCAAATG GTCCACTCTG TGCATGCTGG TGGAGCTGCA CACACAGGGC 5100

CAGGACCGCT TTGTTTATAC CTGCAATGAG TGCAAACACC ATGTGGAAAC ACGCTGGCAC 5160

TGCACTGTGT GTGAGGACTA TGACCTTTGT ATCAATTGCT ACAACACAAA GAGCCACACC 5220

CATAAGATGG TGAAGTGGGG GCTAGGCCTA GATGATGAGG GCAGCAGTCA GGGTGAGCCA 5280

CAGTCCAAGA GCCCCCAGGA ATCCCGGCGT CTCAGCATCC AGCGCTGCAT CCAGTCCCTG 5340

GTGCATGCCT GCCAGTGTCG CAATGCCAAC TGCTCACTGC CGTCTTGCCA GAAGATGAAG 5400

CGAGTCGTGC AGCACACCAA GGGCTGCAAG CGCAAGACTA ATGGAGGATG CCCAGTGTGC 5460

AAGCAGCTCA TTGCTCTTTG CTGCTACCAC GCCAAACACT GCCAAGAAAA TAAATGCCCT 5520

GTGCCCTTCT GCCTCAACAT CAAACATAAC GTCCGCCAGC AGCAGATCCA GCACTGCCTG 5580

CAGCAGGCTC AGCTCATGCG CCGGCGAATG GCAACCATGA ACACCCGCAA TGTGCCTCAG 5640

CAGAGTTTGC CTTCTCCTAC CTCAGCACCA CCCGGGACTC CTACACAGCA GCCCAGCACA 5700

CCCCAAACAC CACAGCCCCC AGCCCAGCCT CAGCCTTCAC CTGTTAACAT GTCACCAGCA 5760 

GGCTTCCCTA ATGTAGCCCG GACTCAGCCC CCAACAATAG TGTCTGCTGG GAAGCCTACC 5820

AACCAGGTGC CAGCTCCCCC ACCCCCTGCC CAGCCCCCAC CTGCAGCAGT AGAAGCAGCC 5880

CGGCAAATTG AACGTGAGGC CCAGCAGCAG CAGCACCTAT ACCGAGCAAA CATCAACAAT 5940

GGCATGCCCC CAGGACGTGA CGGTATGGGG ACCCCAGGAA GCCAAATGAC TCCTGTGGGC 6000

CTGAATGTGC CCCGTCCCAA CCAAGTCAGT GGGCCTGTCA TGTCTAGTAT GCCACCTGGG 6060

CAGTGGCAGC AGGCACCCAT CCCTCAGCAG CAGCCGATGC CAGGCATGCC CAGGCCTGTA 6120

ATGTCCATGC AGGCCCAGGC AGCAGTGGCT GGGCCACGGA TGCCCAATGT GCAGCCAAAC 6180

AGGAGCATCT CGCCAAGTGC CCTGCAAGAC CTGCTACGGA CCCTAAAGTC ACCCAGCTCT 6240

CCTCAGCAGC AGCAGCAGGT GCTGAACATC CTTAAATCAA ACCCACAGCT AATGGCAGCT 6300

TTCATCAAAC AGCGCACAGC CAAGTATGTG GCCAATCAGC CTGGCATGCA GCCCCAGCCC 6360

GGACTTCAAT CCCAGCCTGG TATGCAGCCC CAGCCTGGCA TGCACCAGCA GCCTAGTTTG 6420

CAAAACCTGA ACGCAATGCA AGCTGGTGTG CCACGGCCTG GTGTGCCTCC ACCACAACCA 6480

GCAATGGGAG GCCTGAATCC CCAGGGACAA GCTCTGAACA TCATGAACCC AGGACACAAC 6540

CCCAACATGA CAAACATGAA TCCACAGTAC CGAGAAATGG TGAGGAGACA GCTGCTACAG 6600

CACCAGCAGC AGCAGCAGCA ACAGCAGCAG CAGCAGCAGC AACAACAAAA TAGTGCCAGC 6660

TTGGCCGGGG GCATGGCGGG ACACAGCCAG TTCCAGCAGC CACAAGGACC TGGAGGTTAT 6720

GCCCCAGCCA TGCAGCAGCA ACGCATGCAA CAGCACCTCC CCATCCAGGG CAGCTCCATG 6780

GGCCAGATGG CTGCTCCAAT GGGACAACTT GGCCAGATGG GGCAGCCTGG GCTAGGGGCA 6840

GACAGCACCC CTAATATCCA GCAGGCCCTG CAGCAACGGA TTCTGCAGCA GCAGCAGATG 6900

AAGCAACAAA TTGGGTCACC AGGCCAGCCG AACCCCATGA GCCCCCAGCA GCACATGCTC 6960

TCAGGACAGC CACAGGCCTC ACATCTCCCT GGCCAGCAGA TCGCCACATC CCTTAGTAAC 7020

CAGGTGCGAT CTCCAGCCCC TGTGCAGTCT CCACGGCCCC AATCCCAACC TCCACATTCC 7080

AGCCCGTCAC CACGGATACA ACCCCAGCCT TCACCACACC ATGTTTCACC CCAGACTGGA 7140

ACCCCTCACC CTGGACTCGC AGTCACCATG GCCAGCTCCA TGGATCAGGG ACACCTGGGG 7200

AACCCTGAAC AGAGTGCAAT GCTCCCCCAG CTGAATACCC CCAACAGGAG CGCACTGTCC 7260

AGTGAACTGT CCCTGGTTGG TGATACCACG GGAGACACAC TAGAAAAGTT TGTGGAGGGT 7320

TTGTAG 7326 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2499 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TCACTTGTCA ATTAATCCAG CTTCCTTAAT TTTACTGAAG AAGAATTTCT CCAGGATATT 60

GGCACATTTG TAGTATTCAC TCTCAGGGGC GTTGTACTCT TTGCAATTGG TAAAGACTCG 120

CTGTAAGTCT GCCATGAATA ATTTCTTAGA CACGTAGTAC CTATTCTTGA GGCGTTCACT 180

CATGGTTTTG AGATCCATGG GGGACCTTAT AACTTCATAA TATCCTGGAG CTTCTGTTCT 240

CTTCACAGGT TCCATGAAGG GCCAAGCGCT TTGATGGCTC TTCACCTGCT GGAGGATGCT 300

CTTGAGCGTG CTGTAAAGCT GGTCAGGGTC TCTGGGCTCT TTACTTTTCT CTTTTCCACT 360

CGGTTTCCAG CCTGTCTCTC TAATTCCAGG AATGCTTTCT ATAGGAATCT GTCGAACTCC 420

ATCTTTAAAA CATGAAAGTC CAGGGTAAAC TTTTCGAATT TGTGCCTGTT TTCTTTCAAT 480

CAGTTTTTTA ATTATCTCCT TCTGCTTTTT AATGATGACA GAAAATTCTG TGTACGGGAT 540

CCGTGGATTT AGCTCACATC CCATTAAAGT GGCTCCTTCA TAATCCTTGA TATAGCCAAC 600

ATATTTGGTT TTAGGTATTT TAATTTCTTT GGAGAAACCC TGTTTCTTAA AGTATCCAAT 660

TGCATATTCA TCTGCATATG TGAGGAAGTT CAGGATGTCA TGCTTTATGT GATATTCTTT 720

CAAATGATTC ATCAGGTGTG TTCCATAGCC CTTGACTTGC TCATTTGAGG TTACAGCACA 780

GAAGACAATC TCTGTGAATC CTTGAGATGG GAACATACGG AAACAGATAC CACCAATAAC 840

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900

GTATTCTTTT GGCATTCGGG GCAGCTGGTG GGAGAAAACG TTCTGTAGGC CAACCAGCCA 960

CATCAGGATC TTCTTGTTTG GTTTCTGGTT GAGGGAATTG CCAACCACGT GAAATTCAAT 1020

TACACCCCTG CGCTCTTCCA ACCTTGCCGC CTCATCCCTG GCCGAGTGTG CTGACAGAAA 1080

ATTGGTCTCT GGTCCAAGCA TTGCTGCAGG GTCCGTGATG GTAGACATAA CCTCGTTGAT 1140

TAATTCCATC GGAATATCCC CCATAACTCG GGGTTTCTTG GCCTCCTCCA GAACATGAGA 1200

ATCAGTCATT TTCCTCTTTT CTCCTGGGTT TGCCTCAAGT CCAGAAGAGG CTTTGCAGGC 1260

AGGACTGCTG CTCCCTGCGT TTGGCTGCTC AAGGGAAGAT GAGGTTGAAT TGTATGAAAT 1320

TGTCCCAGCC ACAGGAGGTG GATTGATAAC TGTTTGGATG CCTAGCTGGC TGGTTCTGGA 1380 

AGAGGCTGAG AGAAAATCCT GATCCCAGAT GGGAGAGTTT TGACTATATA CTTCTTCTTC 1440

TAGCATGGAC AGAAATTTTG GGAAATGAGT GAGGATTAGA GTTCGTTTTT CAAGAGGCAG 1500

TTTATCTTTT TCCTGTCTTG CTTGTTCCAG GAGTTGTCGC CTCATAACAG TGAAGACCGA 1560

GCGAAGCAAT GTTCTCCCAA ACACCTGTGT GGTTTCGTAC CGAGGTAGAC TGTCGCAGAA 1620

CTGTGGCACG TTGCAGTAAC ACAGCCACCT TGTGTAGTTC TCTTTGTATC CAGAAATATC 1680

ATCATTGGGA GATCGCAGTC TTCGTTGAGA TGGTGCCTCC AGATGCCAAT AGTTGATGCG 1740

GTTTAGGAAC ATTTTTGCCA ACTCAACTAT TGTTTGCCTT TCTTTTGCTG GCAGGTGACT 1800

AAATTTGTAC TGCACAAAGT TATTCACACC CTGTTCAATG CTAGGTTTTT CAAATGGGGG 1860

TTTCTTTTCC AAAGAGCCTT CAACCACAGG TTTTCCTCTT TGTAAAATAG ACTTTCTCAA 1920

GAGCTTAAAT AGATAGAAAT AAACTTGTTT GGTATCTGCA TCTTCTTCCT TGTGGACACA 1980

GGTAAAGAGA TATTCCACAT CCAATACTAT TCCCAGGAGT CTGTTCATTT CTTCCTCTGA 2040

CACATTCTCC AGGTGGGAAA CATGAGCAGC TAGGGCATGG CTACAACTCC GACAGGATTC 2100

TGTTAGACTG ACAATTATTT GCTGCAGGTC GGCTCTGGGG GGAGTGGGTG AGGGGTTAGG 2160

GTTTTTCCAG CCATTACATT TACAAGACTC CTCGGCCTTG CAGGCGGAGT ACACTCCGAG 2220

TTTCTCCAGT TTCTTGGCCC GCGGAGCGGA GCGTAGTTGC GCTTTCTTCA CGGCGATTCG 2280

GGCCGAGCCA CCGCCTCCCG GTCCTTCGGC CGTGCCCGCT GCAGCCACTG CCGTCGCCGG 2340

ACCGCAGGCG CCCGAGCCCC CGGCGGCAGC GGCGCAGGGG GAGCCCTGCG GGGGCGCGGG 2400

CGGAAGCGCC GCAGGCTGCG GGGGCAGCGC CCCGGGCCCG GCCCCTGCCC CGGCTCCTGC 2460

CCCGCAGCCG CCCGGCCCGG CCCCGCCAGC CTCGGACAT 2499

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TCACTTGTCA ATCAACCCTG CTTCCTTAAT TTTACTGAAG AAGAACTTCT CCAGGATGCT 60

GGCGCATTTG TAGTACTCGC TCTCGGGAGG GTTGTACTCC TTGCAGTTGG TGAACACTCG 120

TTGCAAGTCC GCCATGAATA ACTTCTTAGA CACATAGTAC CTGTTCCTGA GGCGTTCACT 180

CATGGTTTTC AGATCCATGG GGAACCTTAT AACTTCATAA TATCCCGGAG CTTCTGTTCT 240

CTTCACTGGT TCCATGAAAG GCCAAGCATT TGGATGGTTC TTCACCTGCT GCAGGATGTT 300

CTTGAGGGTG CTGTAAACGT GCTCAGGGTC TTTGGGCTCT TTACTTTTCT CTTTTCCACT 360

TGGTTTCCAG CCTGTCTCTC TGATTCCAGG AATGCTTTCT ATAGGAATCT GCCGAACTCC 420

ATCTTTGAAA CACGAAAGTC CAGGGTAGAC TTTTCGAATC TGGGCTTGTT TTCTTTCTAT 480

CAGCTTTTTA ATGATCTCCT TCTGCTTTTT AATGATGACA GAGAACTCTG TGTATGGGAT 540

CTGAGGGTTC AGCTCACATC CCATCAAAGT GGCCCCTTCA TAATCCTTGA TGTAGCCAAC 600

ATATTTGGTT TTAGGTATTT TGATTTCTTT GGAGAAACCC TGCTTCTTGA AATAGCCGAT 660

GGCATACTCA TCTGCATATG TGAGGAAGTT GAGGATCTCG TGCTTTATGT GGTATTCTTT 720

GAGATGGTTC ATCAGGTGGG TTCCATAGCC CTTGACTTGT TCATTTGAGG TTACTGCACA 780

GAAAACAATC TCTGTGAATC CCTGGGATGG AAACATCCGG AAACAGATAC CACCAATGAC 840

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900

GTACTCTTTG GGCATTCTGG GCAGCTGGTG GGAAAACACA TTCTGGAGGC CCACGAGCCA 960

CATCAGGATC TTCTTGTTTG GTTTCTGGTT CAGGGAGTTG CCCACCACGT GGAATTCAAT 1020

GACACCCCTG CGTTCTTCCA GCCGTGCCGC CTCATCTCTG GCCGAATGGG CTGACAGAAA 1080

ATTGGTCTCT GGTCCAAGCA TCCCTGCAGG GTCTGTGATG GTAGACATGA CCTCATTGAT 1140

CAATTCCACG GGAATATCCC CCATCACTCG AGATCTCTTG GCCTCCTCGG GAGCATGAGA 1200

GTTGTTCATT TTCCTCTTTT CTCCCGGGTT TGCTTCAAGC CCAGAAGAGC CTCTGCATCC 1260

AGGACTTGTT CTCCCTCCAT TGATCTGCTC ATGGGAAGTT GAATTTGAAC TGAACAATGC 1320

TGTCCCAGTA ACAGGAGGAC TGATTACTGT TTGGATTCCT AGCGGGCTGG TTCTGGAAGA 1380

GGCTGAGAGA AAATCCTGAT CCCAGATAGG AGAATTTTGA CTATACACTT CTTCTTCCAA 1440

CATGGACAGA AACTTTGGGA AATGTGTGAG GATAAGCGTG CGTTTCTCAA GAGGCAGTTT 1500

GTCTTTTTTC TGTCTGGCTT GTTCCAAGAG CTGTCGTCTC ATGATGGTGA AGACCGAGCG 1560

AAGCAATGTT CTCCCAAACA CCTTTGTGGT TTCGTACCGA GGTAAGCTGT CACAGAACTG 1620

CGGTACATTG CAGTAGCACA ACCACCTTGT GTAGTTTTCC TTGTATCCAG AGATGTCATC 1680

ATTGGGAGAC CGTAGTCTCC GCTGAGATGG AGCCTCCAGA TGCCAGTAGT TGATGCGGTT 1740

CAGAAACATC TTGGCCAGCT CGATCGTTGT CTGCCTCTCT TTCGATGGCA AGTGACTAAA 1800

CTTGTACTGC ACGAAGTTGT TCACACCCTG TTCAATACTG GGCTTCTCAA ATGGCGGCTT 1860

CTTCTCCAAG GAGCCTTCAA CCACAGGTTT TCCTCTTTGT AAAATTGACT TTCTCAAGAG 1920

CTTGAATAGG TAGAAGTACA CTTGTTTGGT ATCTGCATCT TCTTCTTTGT GGACGCAGGT 1980

GAAGAGGTAC TCCACATCCA ACACAATTCC CAGGAGTCTG TCCATCTCTT CCTCTGACAC 2040

ATTCTCCAAG TGAGAAACGT GAGCAGCAAG GGCATGGCTA CAGCTTCGAC AGGATTCTGT 2100

CAAACTGACA ATTATCTGCT GGAGGTCTCC TCTTGGTGGA GTAGGAGAGG GGTTAGGGTT 2160

CTTCCAGCCA TTGCATTTAC AGGACTCCTC TGCCTTGCAG GCGGAGTACA CGCCGAGTTT 2220

CTCCAGCTTC TTCGCCCGCG GAGCAGAGCG CAACTGCGCC TTCTTCACGG CGATCCGGGC 2280

CGAGCCGCCT CCTCCCGGTC CCTCGGCGGT GCCCGCCGCG GCCACCGGCG TCGCTGGCCC 2340

GCAGGAAGCA GAGCTCCCGG CAGCGGTGGC CAGGGTCCGG GGGGAACCGT GCGGGGGCGC 2400

GGGAGGCAGT GCTGGGGACC CGGCCCCGCC AGCCTCGGCC AT 2442 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CCCGCCAGCC TCGGACATGC 20

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CCCGCCAGCC TCGGCCATGC 20

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATGGCCGAGG CTGGCGGGGC CGGGTCCCCA GCACTGCCTC CCGCGCCCCC GCACGGTTCC 60 

CCCCGGACCC TGGCCACCGC TGCCGGGAGC TCTGCTTCCT GCGGGCCAGC GACGCCGGTG 120

GCCGCGGCGG GCACCGCCGA GGGACCGGGA GGAGGCGGCT CGGCCCGGAT CGCCGTGAAG 180

AAGGCGCAGT TGCGCTCTGC TCCGCGGGCG AAGAAGCTGG AGAAACTCGG CGTGTACTCC 240

GCCTGCAAGG CAGAGGAGTC CTGTAAATGC AATGGCTGGA AGAACCCTAA CCCCTCTCCT 300

ACTCCACCAA GAGGAGACCT CCAGCAGATA ATTGTCAGTT TGACAGAATC CTGTCGAAGC 360

TGTAGCCATG CCCTTGCTGC TCACGTTTCT CACTTGGAGA ATGTGTCAGA GGAAGAGATG 420

GACAGACTCC TGGGAATTGT GTTGGATGTG GAGTACCTCT TCACCTGCGT CCACAAAGAA 480

GAAGATGCAG ATACCAAACA AGTGTACTTC TACCTATTCA AGCTCTTGAG AAAGTCAATT 540

TTACAAAGAG GAAAACCTGT GGTTGAAGGC TCCTTGGAGA AGAAGCCGCC ATTTGAGAAG 600

CCCAGTATTG AACAGGGTGT GAACAACTTC GTGCAGTACA AGTTTAGTCA CTTGCCATCG 660

AAAGAGAGGC AGACAACGAT CGAGCTGGCC AAGATGTTTC TGAACCGCAT CAACTACTGG 720

CATCTGGAGG CTCCATCTCA GCGGAGACTA CGGTCTCCCA ATGATGACAT CTCTGGATAC 780

AAGGAAAACT ACACAAGGTG GTTGTGCTAC TGCAATGTAC CGCAGTTCTG TGACAGCTTA 840

CCTCGGTACG AAACCACAAA GGTGTTTGGG AGAACATTGC TTCGCTCGGT CTTCACCATC 900

ATGAGACGAC AGCTCTTGGA ACAAGCCAGA CAGAAAAAAG ACAAACTGCC TCTTGAGAAA 960

CGCACGCTTA TCCTCACACA TTTCCCAAAG TTTCTGTCCA TGTTGGAAGA AGAAGTGTAT 1020

AGTCAAAATT CTCCTATCTG GGATCAGGAT TTTCTCTCAG CCTCTTCCAG AACCAGCCCG 1080

CTAGGAATCC AAACAGTAAT CAGTCCTCCT GTTACTGGGA CAGCATTGTT CAGTTCAAAT 1140

TCAACTTCCC ATGAGCAGAT CAATGGAGGG AGAACAAGTC CTGGATGCAG AGGCTCTTCT 1200

GGGCTTGAAG CAAACCCGGG AGAAAAGAGG AAAATGAACA ACTCTCATGC TCCCGAGGAG 1260

GCCAAGAGAT CTCGAGTGAT GGGGGATATT CCCGTGGAAT TGATCAATGA GGTCATGTCT 1320

ACCATCACAG ACCCTGCAGG GATGCTTGGA CCAGAGACCA ATTTTCTGTC AGCCCATTCG 1380

GCCAGAGATG AGGCGGCACG GCTGGAAGAA CGCAGGGGTG TCATTGAATT CCACGTGGTG 1440

GGCAACTCCC TGAACCAGAA ACCAAACAAG AAGATCCTGA TGTGGCTCGT GGGCCTCCAG 1500

AATGTGTTTT CCCACCAGCT GCCCAGAATG CCCAAAGAGT ACATCACACG GCTCGTCTTT 1560

GACCCGAAAC ACAAAACCCT TGCTTTAATT AAAGATGGCC GTGTCATTGG TGGTATCTGT 1620

TTCCGGATGT TTCCATCCCA GGGATTCACA GAGATTGTTT TCTGTGCAGT AACCTCAAAT 1680

GAACAAGTCA AGGGCTATGG AACCCACCTG ATGAACCATC TCAAAGAATA CCACATAAAG 1740

CACGAGATCC TCAACTTCCT CACATATGCA GATGAGTATG CCATCGGCTA TTTCAAGAAG 1800

CAGGGTTTCT CCAAAGAAAT CAAAATACCT AAAACCAAAT ATGTTGGCTA CATCAAGGAT 1860

TATGAAGGGG CCACTTTGAT GGGATGTGAG CTGAACCCTC AGATCCCATA CACAGAGTTC 1920

TCTGTCATCA TTAAAAAGCA GAAGGAGATC ATTAAAAAGC TGATAGAAAG AAAACAAGCC 1980

CAGATTCGAA AAGTCTACCC TGGACTTTCG TGTTTCAAAG ATGGAGTTCG GCAGATTCCT 2040

ATAGAAAGCA TTCCTGGAAT CAGAGAGACA GGCTGGAAAC CAAGTGGAAA AGAGAAAAGT 2100

AAAGAGCCCA AAGACCCTGA GCACGTTTAC AGCACCCTCA AGAACATCCT GCAGCAGGTG 2160

AAGAACCATC CAAATGCTTG GCCTTTCATG GAACCAGTGA AGAGAACAGA AGCTCCGGGA 2220

TATTATGAAG TTATAAGGTT CCCCATGGAT CTGAAAACCA TGAGTGAACG CCTCAGGAAC 2280

AGGTACTATG TGTCTAAGAA GTTATTCATG GCGGACTTGC AACGAGTGTT CACCAACTGC 2340

AAGGAGTACA ACCCTCCCGA GAGCGAGTAC TACAAATGCG CCAGCATCCT GGAGAAGTTC 2400

TTCTTCAGTA AAATTAAGGA AGCAGGGTTG ATTGACAAGT GA 2442

What is claimed is:

1. A purified protein designated P/CAF having a molecular weight of about 93,000
daltons as determined by sodium dodecyl sulfate polyacrylamide gel electrophoresis
under reducing conditions and which acetylates histones.

2. The protein of claim 1 consisting of the amino acid sequence of SEQ ID NO: 1.

3. The protein of claim 1 comprising the amino acid sequence of SEQ ID NO: 2.

4. The protein of claim 1, which also binds to the amino acid sequence of SEQ ID 
NO:3 on a p300 cellular protein and to amino acid residues 1805-1854 of a CBP cellular 
protein (SEQ ID NO:9). 

5. A fragment of the protein of claim 1 having histone acetyltransferase activity.

6. A polypeptide consisting of the amino acid sequence of SEQ ID NO: 2.

7. A fragment of the protein of claim 1 which binds to the amino acid sequence of
SEQ ID NO: 3 on the p300 cellular protein and the amino acid sequence of SEQ ID
NO:9 on the CBP cellular protein.

8. A polypeptide consisting of the amino acid sequence of SEQ ID NO:4. 

9. A nucleic acid consisting of the nucleotide sequence of SEQ ID NO: 10. 

10. A nucleic acid having a nucleotide sequence which encodes the protein of claim
1.

11. A nucleic acid having a nucleotide sequence which encodes the protein of claim
2.

12. A nucleic acid having a nucleotide sequence which encodes the protein of claim
3.

13. A nucleic acid consisting of the nucleotide sequence which encodes the protein 
of claim 4. 

14. A nucleic acid complementary to and which selectively hybridizes with the
nucleic acid of claim 11 under stringent hybridization conditions.

15. A fragment of the nucleic acid of claim 9, which encodes a polypeptide that 
acetylates histones. 

16. A fragment of the nucleic acid of claim 9, which encodes a polypeptide which 
binds to the amino acid sequence of SEQ ID NO:3 on the p300 cellular protein and the 
amino acid sequence of SEQ ID NO:9 on the CBP cellular protein. 

17. A purified antibody which specifically binds the protein of claim 1.

18. A purified antibody which specifically binds the protein of claim 2.

19. A purified antibody which specifically binds the protein of claim 3.

20. A purified antibody which specifically binds the protein of claim 4. 

21. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of P/CAF comprising: 

a) contacting the substance with a system in which histone acetylation by 
P/CAF can be determined;

b) determining the amount of histone acetylation by P/CAF in the
presence of the substance; and 

c) comparing the amount of histone acetylation by P/CAF in the 
presence of the substance with the amount of histone acetylation by P/CAF in the 
absence of the substance, a decreased or increased amount of histone acetylation by 
P/CAF in the presence of the substance indicating a substance that can inhibit or 
stimulate, respectively, the histone acetyltransferase activity of P/CAF. 

22. An assay for screening substances for the ability to inhibit binding of P/CAF to 
p300/CBP comprising: 

a) contacting the substance with a system in which the P/CAF binding of 
p300/CBP can be determined;

b) determining the amount of P/CAF binding of p300/CBP in the presence of 
the substance; and 

c) comparing the amount of binding of P/CAF to p300/CBP in the presence of 
the substance with the amount of binding of P/CAF to p300/CBP in the absence of the 
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the 
substance indicating a substance that can inhibit binding of P/CAF to
p300/CBP.

23. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the p300 protein comprising amino acid residues 
1767-1816 (SEQ ID NO:3) and the protein of claim 4. 

24. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising amino acid residues 
1805-1854 (SEQ ID NO:9) and the protein of claim 4. 

25. The method of claim 22, wherein the system consists of a cell extract produced
from cells producing both p300 and P/CAF.

26. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of p300/CBP comprising: 

a) contacting the substance with a system in which histone acetylation by 
p300/CBP can be determined; 

b) determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and

c) comparing the amount of histone acetylation by p300/CBP in the 
presence of the substance with the amount of histone acetylation by p300/CBP in the 
absence of the substance, a decreased or increased amount of histone acetylation by 
p300/CBP in the presence of the substance indicating a substance that can inhibit or 
stimulate, respectively, the histone acetyltransferase activity of p300/CBP. 

27. An assay for screening substances for the ability to inhibit binding of a DNA- 
binding transcription factor to p300/CBP comprising: 

a) contacting the substance with a system in which the DNA-binding 
transcription factor binding of p300/CBP can be determined;

b) determining the amount of DNA-binding transcription factor binding of 
p300/CBP in the presence of the substance; and 

c) comparing the amount of binding of DNA-binding transcription factor to 
p300/CBP in the presence of the substance with the amount of binding of DNA-binding 
transcription factor to p300/CBP in the absence of the substance, a decreased amount of 
binding of DNA-binding transcription factor to p300/CBP in the presence of the 
substance indicating a substance that can inhibit binding of a DNA-binding
transcription factor to p300/CBP.

28. The method of claim 27, wherein the system consists of a cell free reaction 
mixture comprising a DNA-binding transcription factor and p300/CBP. 

29. The method of claim 27, wherein the system consists of a cell free reaction
mixture comprising a fragment of the CBP protein comprising a DNA-binding
transcription factor and p300/CBP.

30. The method of claim 27, wherein the system consists of a cell extract produced
from cells producing both a DNA-binding transcription factor and p300/CBP. 

31. The method of claim 27, wherein the DNA-binding transcription factor is
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun,
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

32. A method for inhibiting the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
inhibiting amount of a substance in a pharmaceutically acceptable carrier. 

33. The method of claim 32, wherein the substance can inhibit the transcription 
modulating activity of P/CAF by preventing the binding of P/CAF to p300/CBP. 

34. A method for stimulating the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
stimulating amount of a substance in a pharmaceutically acceptable carrier. 

35. The method of claim 34, wherein the substance can stimulate the transcription 
modulating activity of P/CAF by promoting the binding of P/CAF to p300/CBP. 

36. The method of claim 34, wherein the substance can stimulate the transcription
modulating activity of P/CAF by stimulating the histone acetyltransferase activity of
P/CAF.

37. A method for inhibiting the histone acetyltransferase activity of p300/CBP in a
subject, comprising administering to the subject a histone acetyltransferase activity
inhibiting amount of a substance in a pharmaceutically acceptable carrier.

38. The method of claim 37, wherein the substance can inhibit the transcription
modulating activity of p300/CBP by preventing the binding of a DNA-binding 
transcription factor to p300/CBP. 

39. The method of claim 38, wherein the DNA-binding transcription factor is
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun,
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

40. The method of claim 37, wherein the substance is an antibody which binds 
p300/CBP. 

41. A method for stimulating the histone acetyltransferase activity of p300/CBP in a
subject, comprising administering to the subject a histone acetyltransferase activity
stimulating amount of a substance in a pharmaceutically acceptable carrier.

42. The method of claim 41, wherein the substance can stimulate the histone
acetyltransferase activity of p300/CBP by promoting the binding of a DNA-binding
transcription factor to p300/CBP.